An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: robots-txt

ProgressPlanner/eco-friendly-robots-txt

Optimizes your site's robots.txt to reduce server load and CO2 footprint by blocking unnecessary crawlers while allowing major search engines and specific tools.

Language: PHP - Size: 64.5 KB - Last synced at: about 11 hours ago - Pushed at: 7 months ago - Stars: 12 - Forks: 0

fooock/robots.txt

🤖 robots.txt as a service. Crawls robots.txt files, downloads and parses them to check rules through an API

Language: Java - Size: 1.92 MB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 16 - Forks: 2

Lexxrt/Blue

🕵️‍♂️ Information gathering tool 🕵️‍♂️

Language: Python - Size: 657 KB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 3

ursnj/seo-master

SEO Master is a powerful all-in-one tool developed to boost your website's visibility and rankings, with features like automatic sitemap generation, customizable robots.txt creation, SEO-optimized metadata, image asset generation, and seamless integration with major search engines.

Language: TypeScript - Size: 162 KB - Last synced at: 29 days ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

EngincanV/SeoHelper

This package helps you easily add meta tags, sitemap.xml, and robots.txt to your project.

Language: C# - Size: 35.2 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 24 - Forks: 2

ecnepsnai/Robots.txt-Block-AI

A robots.txt to ask AI crawlers not to steal your content

Size: 1000 Bytes - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

figuren-theater/ft-seo

Integrations dedicated to search engines and social media platforms for all sites of the WordPress multisite network figuren.theater

Language: PHP - Size: 160 KB - Last synced at: 1 day ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

ptsochantaris/can-proceed

A small, tested, no-frills parser of robots.txt files in Swift.

Language: Swift - Size: 26.4 KB - Last synced at: 19 days ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

GeekInTheNorth/Stott.Optimizely.RobotsHandler

An admin extension for Optimizely CMS 12+ for managing robots.txt on a per site basis.

Language: C# - Size: 118 MB - Last synced at: 27 days ago - Pushed at: 8 months ago - Stars: 3 - Forks: 3

tractorcow/silverstripe-robots

Simple robots generation module for Silverstripe (SS 4 and above)

Language: PHP - Size: 24.4 KB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 6

muratgozel/robotstxt-util

RFC 9309 spec compliant robots.txt builder and parser. 🦾 No dependencies, fully typed.

Language: TypeScript - Size: 136 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 3 - Forks: 1

kevinbenabdelhak/WP-Robots-txt-Editor

WP Robots txt Editor is a WordPress plugin for editing the robots.txt file from a simple options page. Generate a default robots.txt and access many features, such as selecting posts, categories, parent/child pages, and more.

Language: PHP - Size: 30.3 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

Sire-NILESH/next-seo

An app to practice SEO associated with SSG, ISR, SSR and other features like dynamic 'Metadata', 'Robots', 'Sitemap', cache, 'Favicon', 'Open Graph' and 'Social Preview' in NextJS14.

Language: TypeScript - Size: 222 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

spences10/robots-txt-syntax-highlighting

robots.txt syntax highlighting for VS Code

Size: 230 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

PabloSanchez87/Web_scrapping_chatbot

A RAG app built on the Streamlit platform that uses the OpenAI API and LangChain to generate contextual answers from a vectorized database of reports and knowledge.

Language: Python - Size: 1.91 MB - Last synced at: 11 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

thefrosty/wp-block-ai-scrapers

Block all known AI Data Scrapers.

Language: PHP - Size: 350 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

leshniak/robotstxt-debug

A tool for debugging robots.txt

Language: JavaScript - Size: 4.88 KB - Last synced at: 26 days ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

alpozkanm/alpozkan.info

Website of Alp Ozkan: Product Manager living in Istanbul, passionate about UX, Software Development and Web3.

Language: TypeScript - Size: 17.9 MB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

ranvis/robots-txt-processor

robots.txt filter and tester for untrusted source

Language: PHP - Size: 52.7 KB - Last synced at: 7 months ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

MLArtist/WebScraper

Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.

Language: Python - Size: 43.9 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 61 - Forks: 14

t1gor/Robots.txt-Parser-Class

PHP class for parsing robots.txt

Language: PHP - Size: 658 KB - Last synced at: 16 days ago - Pushed at: over 2 years ago - Stars: 83 - Forks: 28

VIPnytt/RobotsTxtParser

An extensible robots.txt parser and client library, with full support for every directive and specification.

Language: PHP - Size: 526 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 26 - Forks: 6

Cmastris/robotstxt-change-monitor

Monitor and report changes across one or more robots.txt files.

Language: Python - Size: 76.2 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 1

ali-habibzadeh/top10-seo-list-for-developers

The top 10 things developers need to know about SEO

Language: HTML - Size: 182 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 3

simtechdev/cs-cart-robots-txt

robots.txt template file for CS-Cart

Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 1

marcortola/behat-seo-contexts

Behat extension for testing some On-Page SEO factors: meta title/description, canonical, hreflang, meta robots, robots.txt, redirects, sitemap validation, HTML validation, performance...

Language: PHP - Size: 150 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 4

nickserv/crediblock

Open source robots.txt denylist for uncredited AI crawlers

Size: 330 KB - Last synced at: about 10 hours ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

ravern/gollum

Robots.txt parser and fetcher for Elixir

Language: Elixir - Size: 29.3 KB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 14 - Forks: 11

William-Fernandes252/astel

An asynchronous web crawling library for Python.

Language: Python - Size: 1.02 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ameygawade/streamlit-robots_txt_generator

This Streamlit app allows users to generate and customize a robots.txt file by selecting user-agents, specifying disallowed paths, enabling crawler delay, and providing a sitemap URL.

Language: Python - Size: 98.3 MB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

samclarke/robotstxt

Go robots.txt parser

Language: Go - Size: 17.6 KB - Last synced at: 12 months ago - Pushed at: over 7 years ago - Stars: 16 - Forks: 7

prod3v3loper/php-honeypot-robots-bait

🍯 HoneyPot Robots Bait

Language: PHP - Size: 36.1 KB - Last synced at: 26 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

toimik/RobotsProtocol

Parsers for robots.txt (aka Robots Exclusion Standard / Robots Exclusion Protocol), Robots Meta Tag, and X-Robots-Tag

Language: C# - Size: 198 KB - Last synced at: 25 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

WaluigiBSOD/waluigibsod.github.io

My (small and raw) personal website.

Language: HTML - Size: 44.9 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

iis2h/RobotSeeker

Fast and reliable Python tool that grabs robots.txt files from a bunch of subdomains asynchronously

Language: Python - Size: 26.4 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

mdreizin/gatsby-plugin-robots-txt

Gatsby plugin that automatically creates robots.txt for your site

Language: JavaScript - Size: 3.92 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 106 - Forks: 27

codeAdrian/egghead-challenge-awesome-seo

Awesome SEO snippets - Head markup, robots.txt, sitemap.xml & Google Schema

Size: 9.77 KB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 8 - Forks: 0

TahaT80/Robots_Scanner

Robots Scanner

Language: Python - Size: 33.2 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

ajayjsr/bloggerrobotstxtgenerator

The Blogger Robots.txt Generator is a tool designed to simplify the process of creating a robots.txt file for websites hosted on the Blogger platform. A robots.txt file is crucial for controlling how search engines crawl your site. This generator allows users to customize and generate a robots.txt file tailored to their specific needs.

Language: HTML - Size: 151 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

A3onn/mapptth

A simple to use multi-threaded web-crawler written in C with libcURL and Lexbor.

Language: C - Size: 205 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

enishant/domain-for-sale

A ready-to-use template to quickly start selling a domain with minimal setup.

Language: PHP - Size: 6.84 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

serpwings/pyrobotstxt

pyrobotstxt: Python Package for robots.txt Files

Language: Python - Size: 2.75 MB - Last synced at: 10 months ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

apchavan/InfoPuller

Helpful CLI application to fetch useful details about website domains or local machine, using the core Windows OS functions.

Size: 1.79 MB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 0

andogq/robots.txt-tools 📦

A simple script to open all the pages listed in a website's robots.txt file

Language: JavaScript - Size: 18.6 KB - Last synced at: about 1 year ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

CrawlCtrl/CrawlCtrl.Reader

Convenient wrapper for the CrawlCtrl deserializer.

Size: 87.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

MaximeGuinard/Robots.txt-Viewer

🌐 A Google extension that displays the contents of a website's robots.txt and sitemap.xml files

Language: JavaScript - Size: 4.88 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

herrbischoff/robots.txt

A sane, minimal robots.txt file (for the western world)

Size: 1000 Bytes - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 1

itgalaxy/robotstxt-webpack-plugin

A webpack plugin to generate a robots.txt file

Language: JavaScript - Size: 784 KB - Last synced at: 11 days ago - Pushed at: about 2 years ago - Stars: 35 - Forks: 7

jirkapinkas/jsitemapgenerator

Java sitemap generator. This library generates a web sitemap and can ping Google, generate an RSS feed, robots.txt, and more, using a friendly, easy-to-use Java 8 functional style of programming.

Language: Java - Size: 256 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 39 - Forks: 12

Zer0Divis0r/rebuild-cache-sitemap

GitHub action to rebuild CDN cache according to sitemaps

Language: Shell - Size: 9.77 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

alexander-irbis/robots_txt

Lightweight robots.txt parser and generator written in Rust.

Language: Rust - Size: 42 KB - Last synced at: 10 months ago - Pushed at: over 4 years ago - Stars: 12 - Forks: 5

SixArm/sixarm_apache_robots_txt

SixArm.com » Apache webserver » robots.txt configuration file

Size: 2.93 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

matheuscostadesign/guia-seo

Checklist gathering the main tags to add when creating HTML pages so that search engines index the site organically.

Size: 81.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 1

rihenperry/csuci-mscs-thesis-dist-web-crawler

Documents my master's-level thesis work on building a continuous, topical web crawler based on Mercator (1999)

Language: TeX - Size: 27.4 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

FiftyDeg/SyliusRobotsPlugin

Fifty Deg robots plugin for Sylius.

Language: PHP - Size: 423 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

PhrozenByte/pico-robots

This is Pico's official robots plugin to add a robots.txt and sitemap.xml to your website. Pico is a stupidly simple, blazing fast, flat file CMS.

Language: PHP - Size: 18.6 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 1

Keyvanhardani/Robots.txt-finder

Robots.txt Finder V1 by Keyvan Hardani

Language: Python - Size: 5.86 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

gbenson/ultimate-sitemap-parser Fork of mediacloud/ultimate-sitemap-parser

Website sitemap parser

Language: Python - Size: 115 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

gcyrillus/extra_SEO

Tools to help with search engine optimization (SEO)

Language: PHP - Size: 239 KB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 1

dubniczky/Bad-Robot

This is a Python crawler that disregards robots.txt rules and downloads disallowed resources

Language: Python - Size: 16.6 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

akashblackhat/dark_web.py

Dark Web: information gathering, footprinting, scanner and recon tool release. Dark Web is an information gathering tool I made in Python 3. To run, it only needs a domain or IP. Dark Web works on any Linux distro that supports Python 3. Author: AKASHBLACKHAT (help for ethical hackers)

Language: Python - Size: 115 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 47 - Forks: 5

sakuagl/RobotsChecker

Reads the robots.txt file to determine if the specified URL is accessible

Language: Python - Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

rimiti/robotizer

Robots.txt parser / generator

Language: TypeScript - Size: 242 KB - Last synced at: 21 days ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

KesarBrown/noindex

'noindex' is a movement for drawing soft boundaries on the internet for search engines and generative AI crawlers.

Size: 3.91 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

lucas-bogos/virtual-store

E-commerce store built for a college integrative project

Language: SCSS - Size: 1.49 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 0

LuXDAmore/nuxt-humans-txt

🧑🏻👩🏻 "We are people, not machines" - An initiative to know the creators of a website. Contains information about the humans behind the website - A Nuxt Module to statically integrate and generate a humans.txt author file - Based on the HumansTxt Project.

Language: JavaScript - Size: 3.9 MB - Last synced at: 23 days ago - Pushed at: over 3 years ago - Stars: 29 - Forks: 1

merwin-asm/Robots.TXT

Robots.txt parser for Python || Better than the OG one for some reasons

Language: Python - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

tberlin-om/robots-txt-checker

robots.txt checker/monitoring - The Python script checks the robots.txt content and status code and sends an email if the check fails

Language: Python - Size: 5.86 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Innmind/Robots.txt

Robots.txt parser

Language: PHP - Size: 176 KB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

visiontechventures/sitemapcrawler

Python scripts to crawl and collect all URLs from a website, get robots.txt and download sitemaps.

Language: Python - Size: 329 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ChrisWinters/robotstxt-manager

A Simple Robots.txt Manager Plugin For WordPress.

Language: PHP - Size: 4.12 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

MikeTheHash/RobotsViewer

A tool for beginners to view the robots.txt file of web applications!

Language: Python - Size: 3.91 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

stormid/robotify-netcore

Provides robots.txt middleware for .NET Core

Language: C# - Size: 70.3 KB - Last synced at: 28 days ago - Pushed at: almost 3 years ago - Stars: 16 - Forks: 5

dosbenjamin/eleventy-webpack-boilerplate 📦

Front-end workflow to start a new project with Eleventy and Webpack.

Language: JavaScript - Size: 5.78 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

eliasdabbas/robotstxt_app

Visual App for Testing URLs and User-agents blocked by robots.txt Files

Language: Python - Size: 24.4 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

bnomei/kirby-robots-writer 📦

Robots for Kirby CMS

Language: PHP - Size: 9.77 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 8 - Forks: 2

brutuscat/medusa Fork of chriskite/anemone 📦

- THIS IS AN OLD FORK - Check out the Medusa Crawler gem ("medusa-crawler") instead

Language: Ruby - Size: 319 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 18 - Forks: 9

cansin/next-with-sitemap 📦

Higher order Next.js config to generate sitemap.xml and robots.txt

Language: JavaScript - Size: 1.27 MB - Last synced at: 8 days ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 3

JamieMagee/robots-txt 📦

Language: Jupyter Notebook - Size: 9.39 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

delltaxa/atlas

Website scanner

Language: Python - Size: 48.8 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

emanuelefavero/robots-txt-templates-

This is a collection of robots.txt templates

Size: 0 Bytes - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

larevanchedessites/google-robotstxt-ruby

🤖 Ruby gem wrapper around Google Robotstxt Parser C++ library

Language: Ruby - Size: 18.6 KB - Last synced at: 5 days ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 3

mdthansil/sharp-seo-tools

Sharp SEO Tools is a collection of free web tools written entirely in JavaScript (19 tools available). Feel free to use them.

Language: JavaScript - Size: 1.66 MB - Last synced at: 27 days ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

0xIbra/robots-txt-component

Fully native robots.txt parsing component without any dependencies.

Language: JavaScript - Size: 19.5 KB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

hvlck/robots

robots.txt parser

Language: Go - Size: 10.7 KB - Last synced at: 12 months ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

jaredhanson/kerouac-robotstxt

robots.txt middleware for Kerouac.

Language: JavaScript - Size: 62.5 KB - Last synced at: 28 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

stovv/next-strapi-sitemap

Generate a sitemap and robots.txt for NextJS using a webhook from Strapi

Language: JavaScript - Size: 82 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 2

michielvoo/robots.txt

PowerShell module for reading robots.txt files

Language: C# - Size: 7.81 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

razorcreations/nova-robots-editor

This Laravel Nova Tool gives your admins the ability to edit the robots.txt file from within the Nova control panel.

Language: PHP - Size: 29.3 KB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 2

callumbwhyte/friendly-robots

A friendly tool for creating dynamic robots.txt files in Umbraco

Language: C# - Size: 109 KB - Last synced at: 28 days ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 1

Becklyn/robots-txt

A package for generating a robots.txt programmatically.

Language: PHP - Size: 22.5 KB - Last synced at: 26 days ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

austinsonger/sitemapsandrobotsaroundtheweb

Sitemaps and Robots.txt for websites around the world.

Size: 137 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

b4dnewz/robots-parse

A lightweight and simple robots.txt parser in node

Language: TypeScript - Size: 647 KB - Last synced at: 12 days ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0

SteeinRu/php-robots

robots.txt generator

Language: PHP - Size: 8.79 KB - Last synced at: 2 months ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 2

vedranvinko/robots

robots.txt

Language: Rust - Size: 14.6 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

hufslion9th/MissingSemester_Crawling

2021 HUFS Missing Semester : Crawling

Language: Python - Size: 6.35 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 0

ossama131/Bias-to-Search-Engines-from-Robots.txt

Determining bias to search engines from Robots.txt

Language: Jupyter Notebook - Size: 224 KB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

php-middleware/block-robots

Middleware to avoid search engine indexing with PSR-7 using robots.txt and X-Robots-Tag

Language: PHP - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 0

useflyyer/robots Fork of ArcaneDigital/parse-robots

Super lightweight plain TypeScript parser for robots.txt with 0 dependencies.

Language: TypeScript - Size: 320 KB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

slemarchand/no-robots

🚫🤖 Override /robots.txt to disallow all web crawlers, regardless of the settings stored in the database. Compatible with Liferay 7.0, 7.1, 7.2, 7.3 and 7.4.

Language: Java - Size: 55.7 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0