An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: robots-txt

Johan4954/SubProbe

⚑ JavaScript-aware crawler for security researchers and bug bounty hunters. Extract hidden endpoints and internal subdomains through static and semantic analysis of JS files. Lightweight. Fast. Sneaky.

Language: JavaScript - Size: 3.01 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 1

crawler-commons/crawler-commons

A set of reusable Java components that implement functionality common to any web crawler

Language: Java - Size: 3.73 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 244 - Forks: 80

holysoles/bot-wrangler-traefik-plugin

A Traefik Middleware Plugin that helps you wrangle those pesky LLM data scrapers.

Language: Go - Size: 156 KB - Last synced at: about 6 hours ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

gigachad80/TxtRipper

Hunt robots.txt via CLI

Language: Ruby - Size: 60.5 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

scrapy/protego

A pure-Python robots.txt parser with support for modern conventions.

Language: DIGITAL Command Language - Size: 3.42 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 65 - Forks: 28

ian-wt-blog-examples/django-robots-txt

A django project that serves a robots.txt file.

Language: Python - Size: 10.7 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

eliashaeussler/typo3-sitemap-robots

🤖 Extension for TYPO3 CMS to inject XML sitemaps into robots.txt

Language: PHP - Size: 1.24 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

GateNLP/ultimate-sitemap-parser

Ultimate Website Sitemap Parser

Language: Python - Size: 409 KB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 217 - Forks: 69

elchiconube/generate-robots-txt

The right robots.txt file for your project

Language: TypeScript - Size: 1.6 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 22 - Forks: 4

samber/the-great-gpt-firewall

🤖 A curated list of websites that restrict access to AI Agents, AI crawlers and GPTs

Language: Python - Size: 3.32 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 90 - Forks: 6

nuxt-modules/robots

Tame the robots crawling and indexing your Nuxt site.

Language: TypeScript - Size: 4.49 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 477 - Forks: 40

Pytlicek/AI-Data-Guard

AI Data Guard

Language: Python - Size: 53.7 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

eliasdabbas/advertools

advertools - online marketing productivity and analysis tools

Language: Python - Size: 23.1 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1,234 - Forks: 226

TurnerSoftware/InfinityCrawler

A simple but powerful web crawler library for .NET

Language: C# - Size: 326 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 252 - Forks: 37

ilgiz87/htaccess-robots.txt-joomla

Template .htaccess and robots.txt files for Joomla

Size: 18.6 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

seantomburke/sitemapper

Parse through any sitemap in Node.js

Language: TypeScript - Size: 1.75 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 119 - Forks: 75

jwmorley73/jwm.robotstxt

Provides Python access to Google's parser for robots.txt files, as used by their Googlebot web scraper.

Language: Python - Size: 160 KB - Last synced at: 1 day ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

samclarke/robots-parser

NodeJS robots.txt parser with support for wildcard (*) matching.

Language: JavaScript - Size: 506 KB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 156 - Forks: 19

Zorger27/zorger27.github.io

✨ This modern and adaptive Landing Page succinctly and visually presents my professional information. 🌟 The site showcases my skills in creating modern, responsive and functional web applications, highlighting attention to detail and quality. 🚀

Language: HTML - Size: 23.2 MB - Last synced at: 4 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

bnomei/kirby3-robots-txt

Manage the robots.txt from the Kirby config file

Language: PHP - Size: 189 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 20 - Forks: 0

Zorger27/CV-Vue

👨‍💻 This is a modern single-page application (SPA) that provides detailed and visually appealing information about me and covers everything that matters: a detailed resume and an Extra section (examples of completed works). 🚀

Language: Vue - Size: 143 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

PuerkitoBio/gocrawl

Polite, slim and concurrent web crawler.

Language: Go - Size: 410 KB - Last synced at: 10 days ago - Pushed at: about 4 years ago - Stars: 2,048 - Forks: 193

sushantrahate/php-static-website-boilerplate

A lightweight, static website built with PHP, HTML, CSS, and JavaScript.

Language: PHP - Size: 8.79 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

spatie/robots-txt

Determine if a page may be crawled from robots.txt, robots meta tags and robot headers

Language: PHP - Size: 134 KB - Last synced at: 18 days ago - Pushed at: 27 days ago - Stars: 236 - Forks: 42

alexrudy/roboto

Library for robots.txt files in Rust

Language: Rust - Size: 27.3 KB - Last synced at: 6 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

temoto/robotstxt

The robots.txt exclusion protocol implementation for Go language

Language: Go - Size: 94.7 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 274 - Forks: 56

zvdy/parsero-go

Parsero is a free script written in Golang which reads the Robots.txt file of a web server and looks at the Disallow entries. The Disallow entries tell the search engines what directories or files hosted on a web server mustn't be indexed.

Language: Go - Size: 32.9 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 1 - Forks: 0

peaceiris/docker-images

A collection of Docker images: robotstxt, linuxbrew, gcloud, and psql

Language: Makefile - Size: 70.3 KB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 0 - Forks: 0

abdellahrk/SeoBundle

A complete SEO solution for Symfony projects. This bundle handles meta tags, Open Graph, Twitter Cards, canonical URLs, sitemaps, and more, helping your app stay search-engine friendly and socially shareable out of the box.

Language: PHP - Size: 4.67 MB - Last synced at: 21 days ago - Pushed at: about 2 months ago - Stars: 23 - Forks: 0

cokeposada/astro-full-starter

A base Astro project to start building your website quickly and efficiently.

Language: Astro - Size: 862 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 7 - Forks: 0

simplecto/sitemap_grabber

A python library to recursively crawl every sitemap.xml for a website. Also handles robots.txt and other well-knowns.

Language: Python - Size: 51.8 KB - Last synced at: 10 days ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

crwlrsoft/robots-txt

Robots Exclusion Standard/Protocol Parser for Web Crawling/Scraping

Language: PHP - Size: 32.2 KB - Last synced at: 27 days ago - Pushed at: 4 months ago - Stars: 11 - Forks: 2

beb7/gflare-tk

Open-Source Python Based SEO Web Crawler

Language: Python - Size: 39 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 172 - Forks: 18

liameno/librengine

Privacy Web Search Engine (not meta, own crawler)

Language: C++ - Size: 21.6 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 71 - Forks: 4

thedaviddias/llms-txt-hub

🤖 The largest directory for AI-ready documentation and tools implementing the proposed llms.txt standard

Language: TypeScript - Size: 36.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 211 - Forks: 64

alextim/astro-lib

Makes it easy to add robots.txt, sitemap and web app manifest during build to your Astro app.

Language: TypeScript - Size: 1.34 MB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 118 - Forks: 6

qiubits2007/XML-Sitemap

Multi-domain XML sitemap generator with support for robots.txt, meta tags, email logging & search engine pinging

Language: PHP - Size: 167 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

arvid-berndtsson/robots-txt-analyzer

Modern robots.txt analyzer with instant analysis, security recommendations, and export capabilities. Built with Qwik and deployed on Cloudflare Pages.

Language: TypeScript - Size: 707 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

emacs-php/robots-txt-mode

Emacs major mode for editing robots.txt

Language: Emacs Lisp - Size: 11.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 4 - Forks: 2

s-thom/create-robots-txt-action

An action to create a robots.txt file from different sources

Language: TypeScript - Size: 1.55 MB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

mhmdiaa/waybackrobots

Enumerate old versions of robots.txt paths using Wayback Machine for content discovery

Language: Go - Size: 4.88 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 47 - Forks: 5

ACP-CODE/astro-robots

A reliable robots.txt generator for Astro projects, offering zero-config setup and Verified Bots support.

Language: TypeScript - Size: 240 KB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 14 - Forks: 3

CamoCatX/camocatx.github.io

The source code for Delete the Matrix blog - Exploiting, Experimenting, and Exploring the Universe

Language: SCSS - Size: 7.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

mguinea/laravel-robots

Laravel package to manage robots

Language: PHP - Size: 524 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 16 - Forks: 4

advanced-astro/rocketbase

🚀 This Astro template offers more than 'Just the Basics', providing a superior option for starting your next project with best practices and a set of essential integrations already built-in.

Language: Astro - Size: 628 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 3 - Forks: 0

arbs09/robotstxt-honeypot

Language: Python - Size: 4.88 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

PuerkitoBio/fetchbot

A simple and flexible web crawler that follows the robots.txt policies and crawl delays.

Language: Go - Size: 2.02 MB - Last synced at: 3 days ago - Pushed at: about 4 years ago - Stars: 789 - Forks: 93

TurnerSoftware/RobotsExclusionTools

A "robots.txt" parsing and querying library for .NET

Language: C# - Size: 226 KB - Last synced at: 23 days ago - Pushed at: 9 months ago - Stars: 29 - Forks: 9

nasa-gcn/remix-seo Fork of balavishnuvj/remix-seo

Collection of SEO utilities like sitemap, robots.txt, etc. for a Remix application. Forked from https://github.com/balavishnuvj/remix-seo

Language: TypeScript - Size: 69.3 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 91 - Forks: 10

itgalaxy/generate-robotstxt

robots.txt generator for Node.js

Language: JavaScript - Size: 2.86 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 66 - Forks: 8

jimsmart/grobotstxt

grobotstxt is a native Go port of Google's robots.txt parser and matcher library.

Language: Go - Size: 238 KB - Last synced at: about 2 months ago - Pushed at: about 3 years ago - Stars: 110 - Forks: 7

schliesser/sitecrawler

TYPO3 sitemap crawler

Language: PHP - Size: 80.1 KB - Last synced at: 11 days ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 1

LexiestLeszek/scrapeGPT

ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bot utilizes Retrieval Augmented Generation and webscraping to return natural language answers to the user's queries.

Language: Python - Size: 62.5 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 82 - Forks: 12

Zorger27/Couture-Metaverse

✨ Couture Metaverse 3D is a unique platform for creating and customizing 3D models! 👗 Choose colors, add textures, apply branding, and create stylish looks! 🕶️ Experiment, mix and customize – your design, your rules! 🚀

Language: Vue - Size: 70.3 MB - Last synced at: 18 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

alexjc/weboptout

Opt-Out tool to check Copyright reservations in a way that even machines can understand.

Language: Python - Size: 75.2 KB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 194 - Forks: 1

middlewares/robots

PSR-15 middleware to enable/disable the robots of the search engines

Language: PHP - Size: 36.1 KB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 3 - Forks: 1

OwenOrcan/YiraBot-Crawler

YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.

Language: Python - Size: 221 KB - Last synced at: 7 days ago - Pushed at: 7 months ago - Stars: 19 - Forks: 0

hrbrmstr/spiderbar

Lightweight R wrapper around rep-cpp for robots.txt (Robots Exclusion Protocol) parsing and path testing

Language: C++ - Size: 88.9 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 9 - Forks: 2

aafeher/go-sitemap-parser

Go language library for parsing Sitemaps

Language: Go - Size: 103 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 1

glyn/nginx_robot_access

NGINX robot access module

Language: Rust - Size: 109 KB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 6 - Forks: 0

raminf/RoboNope-nginx

Take control of your own content. Enforce access to disallowed web URLs.

Language: C - Size: 948 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Zorger27/Vue-Threejs-2

🪐 CuboVerse (a blend of the Spanish word "cubo", meaning "cube", and the English word "universe", i.e., "the universe of cubes") is an interactive 3D platform that allows users to observe and control various cube models in a virtual space.

Language: Vue - Size: 15.7 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

chrisakroyd/robots-txt-parser

A lightweight robots.txt parser for Node.js with support for wildcards, caching and promises.

Language: JavaScript - Size: 71.3 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 9

BeardedFish/vscode-robots-dot-txt-support

An extension for Visual Studio Code that enables support for robots.txt files. 🤖

Language: TypeScript - Size: 159 KB - Last synced at: 25 days ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

schnti/kirby-robots

Kirby CMS plugin that adds a route for robots.txt

Language: PHP - Size: 2.93 KB - Last synced at: 19 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

simonw/datasette-block-robots

Datasette plugin that blocks robots and crawlers using robots.txt

Language: Python - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 0

kyr0/astro-launchpad

An Astro project template for decent projects: auth, i18next, Bootstrap, sitemap, webworker, robots.txt, preact, react, endpoints, endpoint clients, OAuth, various Astro features and data loading preconfigured

Language: CSS - Size: 11.6 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 53 - Forks: 13

tawhidurrahmandear/robots.txt-generator

robots.txt Generator 🌐 Live Preview at https://www.devilhunter.net/p/robotstxt-generator.html

Language: JavaScript - Size: 25.4 KB - Last synced at: 10 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

VIPnytt/UserAgentParser

User-Agent parser for robots.txt, X-Robots-tag and Robots-meta-tag rule sets

Language: PHP - Size: 38.1 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 2

Zorger27/3dConfigurator

👑 A 3D configurator is an innovative online technology that enables users to interact with 3D product models in real-time. 💎 It's a powerful tool for businesses that allows your customers to customize products to their preferences. 🌈

Language: Vue - Size: 15.4 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

jonasjacek/robots.txt

Simple robots.txt template. Keeps unwanted robots out (disallow). Whitelists (allow) legitimate user agents. Useful for all websites.

Size: 135 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 87 - Forks: 38

Cyb3r3x3r/Chanakya

Scan websites for multiple things such as honeypots, whois, port scans, etc.

Language: Python - Size: 36.1 KB - Last synced at: 27 days ago - Pushed at: 8 months ago - Stars: 6 - Forks: 1

mehdiraized/advanced-seo-toolkit

Advanced SEO Toolkit is a comprehensive solution for optimizing your WordPress site for search engines

Language: PHP - Size: 86.9 KB - Last synced at: 4 days ago - Pushed at: 8 months ago - Stars: 1 - Forks: 1

p0dalirius/RobotsValidator

A python script to check if URLs are allowed or disallowed by a robots.txt file.

Language: Python - Size: 190 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 21 - Forks: 2

nicholasbergesen/robots-parser

Parse robots.txt and traverse sitemaps.

Language: C# - Size: 6.6 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 2

Sobak/scrawler

Declarative, scriptable web robot (crawler) and scraper

Language: PHP - Size: 248 KB - Last synced at: 2 months ago - Pushed at: about 5 years ago - Stars: 9 - Forks: 1

rix4uni/robotxt

Extract endpoints marked as Allow and Disallow in robots.txt

Language: Go - Size: 1.95 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 1 - Forks: 1

schnti/kirby2-robots 📦

Kirby 2 CMS plugin that adds a route for robots.txt

Language: PHP - Size: 2.93 KB - Last synced at: 4 months ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 0

kritihq/robotstxt-generator

A simple HTML + JS web app that generates a robots.txt file

Language: HTML - Size: 9.77 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

ekalinin/robots.js

Parser for robots.txt for node.js

Language: JavaScript - Size: 71.3 KB - Last synced at: 19 days ago - Pushed at: about 4 years ago - Stars: 67 - Forks: 21

momenbasel/pyrobots

A tool that gets all paths listed in robots.txt and opens them in the browser.

Language: Python - Size: 462 KB - Last synced at: 6 days ago - Pushed at: almost 6 years ago - Stars: 14 - Forks: 7

AleksandrHovhannisyan/eleventy-plugin-robotstxt

Generate a robots.txt file for your Eleventy site

Language: JavaScript - Size: 69.3 KB - Last synced at: 4 days ago - Pushed at: 10 months ago - Stars: 9 - Forks: 0

Zorger27/Currencies

💹 This modern web application is designed to provide a convenient and visual display of current foreign exchange rates. The app retrieves real-time data from the National Bank of Ukraine API, displaying exchange rates relative to the Ukrainian hryvnia (UAH) in innovative and interactive formats.

Language: Vue - Size: 9.34 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Zorger27/Vue-TS-Template

🌐 A powerful starter template for developing Single Page Applications (SPA), perfectly suited for modern web projects. 🚀 Easy customization and scalability make this template the perfect foundation for building complex and feature-rich applications.

Language: Vue - Size: 11.7 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Zorger27/Vue-Threejs-1

🌐 A web application with a dynamic and interactive 3D model, built using the advanced Three.js library. 🎲 The colorful rotating cube allows users to take full control of the model.

Language: Vue - Size: 16 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Zorger27/Vue-JS-Template

🌐 A convenient and flexible starter template for building modern single-page applications (SPA). 🚀 With its intuitive structure and code flexibility, this template is perfect for quickly starting new projects or developing complex web applications! 🌟

Language: Vue - Size: 4.39 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Zorger27/Vue-Threejs-Template

🌐 A convenient and powerful starter template for creating modern Single Page Applications (SPA) with 3D graphics integration. 🚀 Thanks to its flexibility and scalability, this solution is perfect for any web project that supports 3D graphics! 💻🌟

Language: Vue - Size: 9.64 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Zorger27/Cryptocurrencies

💎 Cryptocurrencies in Real-Time: Interactive and Visual! 🚀 This web application transforms cryptocurrency rate monitoring into an engaging experience. 📈 Using the CoinGecko API, it displays up-to-date cryptocurrency rates in USD, offering users flexibility and interactivity.

Language: Vue - Size: 11.5 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Zorger27/Weather

☀️ Custom-built weather forecasting web app that delivers real-time data from OpenWeather for any city worldwide. 🌈 Whether you're a tech enthusiast or just curious about the weather, this app has something for everyone! ⛄️

Language: Vue - Size: 8.29 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

k3ldar/.NetCorePluginManager

.Net Core Plugin Manager, extend web applications using plugin technology enabling true SOLID and DRY principles when developing applications

Language: C# - Size: 15.3 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 31 - Forks: 19

alejsanc/nexttypes

NextTypes is a standards based information storage, processing and transmission system that integrates the characteristics of other systems such as databases, programming languages, communication protocols, file systems, document managers, operating systems, frameworks, file formats and hardware in a single tightly integrated system using a common data types system.

Language: Java - Size: 9.56 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 6 - Forks: 0

AntoineGagne/robots

A parser for robots.txt with support for wildcards. See also RFC 9309.

Language: Erlang - Size: 30.3 KB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 2

vxern/robots_txt

⚙️ A quality `robots.txt` ruleset parser to ensure your application follows the standard specification for the file.

Language: Dart - Size: 46.9 KB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

9dl/RobotsSniffer

Tool to analyze and parse website robots.txt for crawler rules.

Language: C# - Size: 48.8 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

ChrisWinters/multisite-robotstxt-manager

A Multisite Robots.txt Manager - Quickly and easily manage all robots.txt files on a WordPress Multisite Website Network.

Language: PHP - Size: 2.36 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 11 - Forks: 6

adileo/MicroFrontier

A lightweight crawler frontier implementation in TypeScript using Redis.

Language: TypeScript - Size: 263 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 0

healsdata/ai-training-opt-out

Known tags and settings suggested to opt out of having your content used for AI training.

Language: HTML - Size: 40 KB - Last synced at: 7 months ago - Pushed at: 12 months ago - Stars: 130 - Forks: 3

folini/Page-Auditor

"Page Auditor for Technical SEO" is an open source Google Chrome Extension created by Franco Folini. Once you add Page Auditor to your browser, it lets you explore and analyze Structured Data, JavaScript scripts, Meta-Tags, Robots.txt and Sitemap.xml files from any webpage. All these elements are critical to improving on-page SEO.

Language: TypeScript - Size: 9.64 MB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 3

gen-x-coder/sitemap-admin

Robots.txt and sitemap.xml generator

Language: PHP - Size: 15.6 KB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

shantihedelin/Movie-night

Movie web application. Inspired by Hemmakväll's website. Created with Vite, implementing Redux, SEO, and tests with Cypress. Uses the TMDB API. Styling in progress.

Language: JavaScript - Size: 2.32 MB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0