robots-txt
Here are 214 public repositories matching this topic...
advertools - online marketing productivity and analysis tools
Updated Dec 19, 2024 - Python
A simple and flexible web crawler that follows the robots.txt policies and crawl delays.
Updated May 19, 2021 - Go
Tame the robots crawling and indexing your Nuxt site.
Updated Dec 25, 2024 - TypeScript
A robots.txt exclusion protocol implementation for the Go language
Updated Nov 9, 2022 - Go
A simple but powerful web crawler library for .NET
Updated Dec 15, 2023 - C#
A set of reusable Java components that implement functionality common to any web crawler
Updated Dec 16, 2024 - Java
Determine if a page may be crawled from robots.txt, robots meta tags and robot headers
Updated Dec 23, 2024 - PHP
Opt-Out tool to check Copyright reservations in a way that even machines can understand.
Updated Jan 8, 2024 - Python
Ultimate Website Sitemap Parser
Updated Dec 24, 2024 - Python
Open-Source Python Based SEO Web Crawler
Updated Jul 7, 2023 - Python
NodeJS robots.txt parser with support for wildcard (*) matching.
Updated Oct 28, 2024 - JavaScript
Known tags and settings suggested to opt out of having your content used for AI training.
Updated Jun 21, 2024 - HTML
Makes it easy to add robots.txt, sitemap and web app manifest during build to your Astro app.
Updated Dec 15, 2023 - TypeScript
grobotstxt is a native Go port of Google's robots.txt parser and matcher library.
Updated Mar 16, 2022 - Go
Gatsby plugin that automatically creates robots.txt for your site
Updated Jan 29, 2024 - JavaScript
Simple robots.txt template. Keeps unwanted robots out (disallow) and whitelists (allow) legitimate user-agents. Useful for all websites.
Updated Feb 18, 2024
Collection of SEO utilities like sitemap, robots.txt, etc. for a Remix application. Forked from https://github.com/balavishnuvj/remix-seo
Updated Dec 16, 2024 - TypeScript
🤖 A curated list of websites that restrict access to AI Agents, AI crawlers and GPTs
Updated Dec 16, 2024 - Python
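
Several of the projects listed above check whether a URL may be crawled and honor crawl delays. As a rough illustration of that idea only, and not the API of any repository listed here, the following minimal sketch uses Python's standard-library urllib.robotparser. The robots.txt content, the user-agent names (MyCrawler, ExampleBot), and the example.com URLs are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5

User-agent: ExampleBot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A generic crawler falls under the "*" group.
allowed_public = parser.can_fetch("MyCrawler", "https://example.com/index.html")
allowed_private = parser.can_fetch("MyCrawler", "https://example.com/private/data")

# ExampleBot has its own group that disallows everything.
allowed_bot = parser.can_fetch("ExampleBot", "https://example.com/index.html")

# Crawl-delay declared for the "*" group, in seconds.
delay = parser.crawl_delay("MyCrawler")

print(allowed_public, allowed_private, allowed_bot, delay)
```

Note that the standard-library parser uses simple prefix matching for paths; wildcard (*) matching, which some of the listed libraries advertise, is not something this sketch demonstrates.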