A robots.txt file instructs bots (also known as robots, spiders, or web crawlers) on how they should interact with your site. You can add rules to manage bot access to specific pages, folders, or your entire site. It's typically used to list pages or folders on your site that you don't want search engines to crawl. Blocking crawling doesn't guarantee a page stays out of search results, since a blocked URL can still be indexed if other sites link to it.
Just like a sitemap, the robots.txt file lives in the top-level directory of your domain, e.g., yourdomain.com/robots.txt.
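Because the file always sits at the root of the domain, its address can be derived from any page URL on the same site. The following is a minimal Python sketch of that idea; the helper name `robots_txt_url` and the sample URL are illustrative and not part of Webflow.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; drop the path, query, and fragment,
    # because robots.txt lives in the top-level directory.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yourdomain.com/blog/post?ref=home"))
```

Note that each subdomain is treated as its own host here, so a page on a different subdomain would map to that subdomain's robots.txt.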
Note
Your site's robots.txt file is publicly visible — avoid relying on it to protect sensitive or private content, as anyone can view which pages or folders you're trying to restrict.
Generate a robots.txt file
To generate a robots.txt file:
- Go to Site settings > SEO > Indexing
- Add the robots.txt rule(s) you want
- Click Save and publish your site
robots.txt rules
Each rule in your robots.txt file generally consists of two main parts:
- User-agent: Identifies which bots the rule applies to (e.g., * for all bots, or Googlebot for Google's web crawler).
- Disallow / Allow: Defines which of your site's pages bots can or can't access. Disallow blocks access, while Allow explicitly permits access. Using / applies the rule to all pages on your site.
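To make the two-part structure concrete, here is a rough Python sketch that splits robots.txt text into groups, each pairing one or more User-agent values with the Allow and Disallow lines that follow them. The function name `parse_groups` and the dictionary layout are assumptions made for illustration; real crawlers use their own parsers and may handle edge cases differently.

```python
def parse_groups(robots_txt: str) -> list:
    """Split robots.txt text into groups of user agents and their rules."""
    groups = []
    current = None
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # skip blank or malformed lines
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            # A User-agent line that follows rules starts a new group;
            # consecutive User-agent lines share the same group.
            if current is None or current["rules"]:
                current = {"user_agents": [], "rules": []}
                groups.append(current)
            current["user_agents"].append(value)
        elif field in ("allow", "disallow") and current is not None:
            current["rules"].append((field, value))
    return groups


example = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

for group in parse_groups(example):
    print(group["user_agents"], group["rules"])
```

Other fields, such as Sitemap, are ignored by this sketch because they are not part of a user-agent group.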
Important
Not all bots follow the rules specified in your robots.txt file, especially malicious or poorly configured ones. As a result, these bots may still access your site, including restricted folders and pages. Search the list of good/verified bots.
| Rule | Example |
| Block all bots from your entire site |
User-agent: *
Disallow: /
|
| Block all bots from a specific page |
This rule blocks all bots from crawling yourdomain.com/secret-page:
User-agent: *
Disallow: /secret-page
|
| Block all bots from a specific folder (e.g., a Collection or set of related pages) |
This rule blocks all bots from crawling all pages in yourdomain.com/secret-folder:
User-agent: *
Disallow: /secret-folder/
|
| Allow a specific bot and block all others from your entire site |
This rule blocks all bots except Googlebot from crawling your entire site:
User-agent: Googlebot
Allow: /
User-agent: *
Disallow: /
|
| Allow multiple specific bots and block all others from your entire site |
This rule allows Googlebot and UptimeRobot to crawl your site and blocks all others:
User-agent: Googlebot
Allow: /
User-agent: UptimeRobot
Allow: /
User-agent: *
Disallow: /
|
| Allow all bots but disallow a specific bot from your entire site |
This rule allows all bots to crawl your entire site except BadBot:
User-agent: BadBot
Disallow: /
User-agent: *
Allow: /
|
| Block all bots from a particular page but allow them to access other pages |
This rule blocks all bots from crawling yourdomain.com/secret-page but allows them to crawl other pages on your site:
User-agent: *
Disallow: /secret-page
Allow: /
|
| Block a specific bot from a particular page but allow it to access other pages |
This rule blocks Googlebot from crawling yourdomain.com/secret-page but allows it to crawl other pages on your site:
User-agent: Googlebot
Disallow: /secret-page
Allow: /
|
| Include a sitemap in robots.txt |
Sitemap: https://yourdomain.com/sitemap.xml
Note
Webflow adds a link to your sitemap in your robots.txt by default. You can remove your sitemap from your robots.txt by toggling Remove sitemap.xml from robots.txt to on.
|
| Signal AI content usage preferences |
Content-Signal: ai-train=no, search=yes, ai-input=no |
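Before publishing, it can help to check how a compliant bot might interpret a set of rules. The sketch below uses Python's standard-library `urllib.robotparser` against two of the rule sets from the table; the helper `is_allowed` and the bot name SomeOtherBot are hypothetical. Python's parser applies the rules within a group in file order, which may differ from crawlers that pick the most specific matching rule, so treat the results as an approximation.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether user_agent may crawl url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text without fetching it
    return parser.can_fetch(user_agent, url)


# Allow a specific bot and block all others from the entire site.
only_googlebot = """\
User-agent: Googlebot
Allow: /
User-agent: *
Disallow: /
"""

# Block all bots from a specific folder.
block_folder = """\
User-agent: *
Disallow: /secret-folder/
"""

print(is_allowed(only_googlebot, "Googlebot", "https://yourdomain.com/pricing"))
print(is_allowed(only_googlebot, "SomeOtherBot", "https://yourdomain.com/pricing"))
print(is_allowed(block_folder, "SomeOtherBot", "https://yourdomain.com/secret-folder/page"))
print(is_allowed(block_folder, "SomeOtherBot", "https://yourdomain.com/about"))
```

A check like this only describes what well-behaved bots should do; it says nothing about bots that ignore robots.txt.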
Content-Signal header
In addition to robots.txt rules, Webflow sites can serve a Content-Signal HTTP header to give AI crawlers more granular instructions about how your content can be used. While robots.txt controls whether bots can access your pages, Content-Signal lets you specify separately whether your content can be used for AI training, search indexing, or AI-generated responses.
To configure the Content-Signal header, go to Site settings > SEO > Indexing.
Example:
Content-Signal: ai-train=no, search=yes, ai-input=no
Note
This is based on a proposed extension to RFC 9309 and is not yet an accepted standard.
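As an illustration of how a client might read these preferences, the following Python sketch turns a Content-Signal value into a dictionary of booleans. The syntax handled here (comma-separated `key=yes|no` pairs) is inferred only from the example above, and the function name `parse_content_signal` is hypothetical; the proposal may define additional keys or values that this sketch does not cover.

```python
def parse_content_signal(value: str) -> dict:
    """Parse a Content-Signal value such as 'ai-train=no, search=yes'."""
    preferences = {}
    for item in value.split(","):
        key, separator, setting = item.strip().partition("=")
        if not separator:
            continue  # skip entries without a key=value shape
        # Assumption: only "yes" grants permission; anything else is False.
        preferences[key.strip()] = setting.strip().lower() == "yes"
    return preferences


signal = parse_content_signal("ai-train=no, search=yes, ai-input=no")
print(signal)
```

With the example value, only the search preference comes out as permitted, matching the intent of blocking AI training and AI input while allowing search.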
Learn more about robots.txt.