Disable search engine indexing

Prevent search engines from indexing pages, folders, your entire site, or just your webflow.io subdomain.

You can control which pages search engines crawl and index on your site by writing a robots.txt file or by adding a noindex tag to specific pages. With these tools, you can keep search engines away from specific pages, folders, your entire site, or your webflow.io subdomain. This is useful for keeping pages, like your site’s 404 page, out of search results.

Important

Content from your site may still be indexed, even if it hasn’t been crawled. That happens when a search engine finds your content either because it was published previously, or because other content online links to it. To remove a previously indexed page from search results, don’t block it in the robots.txt, because a crawler that can’t fetch the page never sees its noindex tag. Instead, use the Sitemap indexing toggle to remove that content from Google’s index. You can also use Google’s removals tool.

How to disable indexing of the Webflow subdomain

You can prevent Google and other search engines from indexing your site’s webflow.io subdomain by disabling indexing from your Site settings.

  1. Go to Site settings > SEO > Indexing section
  2. Set Staging indexing to off
  3. Click Save and publish your site

This publishes a separate robots.txt on the webflow.io subdomain only, which tells search engines to ignore that subdomain.

Note

You’ll need a Site plan or paid Workspace to disable search engine indexing of the Webflow subdomain.

How to enable or disable indexing of site pages

There are two ways to disable indexing of site pages:

  • By using the Sitemap indexing toggle in Page settings
  • By generating a robots.txt file

Note that if you disable indexing of a site page via a robots.txt file, the page will still be included in your site’s auto-generated sitemap (if you’ve enabled the sitemap). Additionally, if you’ve previously added a noindex tag to a site page via custom code, the page will still be included in your site’s auto-generated sitemap (unless you toggle Sitemap indexing to off).

How to disable indexing of site pages with the Sitemap indexing toggle

If you disable indexing of a static site page with the Sitemap indexing toggle, that page will no longer be indexed by search engines and will no longer be included in your site’s sitemap. You can only disable indexing with the toggle if you’ve enabled your site’s auto-generated sitemap.

Note

The Sitemap indexing toggle adds <meta content="noindex" name="robots"> to your site page. This tells search engines not to index the page. Crawlers still need to be able to fetch the page to see the tag, so don’t also block the page in robots.txt.
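
To confirm that a published page carries the tag, you can inspect its HTML. The following is a minimal sketch using only Python's standard library; the names RobotsMetaParser and has_noindex are hypothetical helpers for illustration, not part of Webflow.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


sample = '<html><head><meta content="noindex" name="robots"></head></html>'
print(has_noindex(sample))  # prints True
```

In practice you would pass in the HTML of your published page (for example, copied from your browser's "View page source") instead of the sample string.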

To prevent search engines from indexing certain site pages:

  1. Go to the page you want to prevent search engines from indexing
  2. Go to Page settings > SEO settings
  3. Toggle Sitemap indexing to off
  4. Publish your site

How to re-enable indexing of site pages with the Sitemap indexing toggle

To allow search engines to index certain site pages:

  1. Go to the page you want to allow search engines to index
  2. Go to Page settings > SEO settings
  3. Toggle Sitemap indexing to on
  4. Publish your site

How to generate a robots.txt file 

A robots.txt file instructs bots (also known as robots, spiders, or web crawlers) on how they should interact with your site. You can add rules to manage bot access to specific pages, folders, or your entire site. It's typically used to list pages or folders on your site that you don't want search engines to crawl.

Just like a sitemap, the robots.txt file lives in the top-level directory of your domain, e.g., yourdomain.com/robots.txt.
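
Because the file always sits at the root of the host, you can work out which robots.txt governs any URL. The sketch below assumes only Python's standard library; robots_txt_url is a hypothetical helper name, and cdn.example.com is a placeholder host, not a real Webflow address.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yourdomain.com/blog/some-post?ref=home"))
# prints https://yourdomain.com/robots.txt

print(robots_txt_url("https://cdn.example.com/images/logo.png"))
# prints https://cdn.example.com/robots.txt
```

The second call shows why a robots.txt on your own domain does not apply to files served from a different host.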

To generate a robots.txt file:

  1. Go to Site settings > SEO > Indexing
  2. Add the robots.txt rule(s) you want
  3. Click Save and publish your site

robots.txt rules

Each rule in your robots.txt file generally consists of two main parts:

  • User-agent: Identifies which bots the rule applies to (e.g., * for all bots, or Googlebot for Google's web crawler).
  • Disallow / Allow: Defines which of your site’s pages bots can or can't access. Disallow blocks access, while Allow explicitly permits access. Using / applies the rule to all pages on your site.

Note

Your site’s robots.txt file is publicly visible — avoid relying on it to protect sensitive or private content, as anyone can view which pages or folders you’re trying to restrict.

Example robots.txt rules

Block all bots from your entire site:

User-agent: *
Disallow: /

Block all bots from a specific page:

User-agent: *
Disallow: /specific-page

Block all bots from a specific folder (e.g., a Collection or set of related pages):

User-agent: *
Disallow: /folder-name/

Allow a specific bot and block all others from your entire site:

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /

Allow multiple specific bots and block all others from your entire site:

User-agent: Googlebot
Allow: /

User-agent: UptimeRobot
Allow: /

User-agent: *
Disallow: /

Allow all bots but disallow a specific bot from your entire site:

User-agent: BadBot
Disallow: /

User-agent: *
Allow: /

Block a specific bot from a particular page but allow it to access all other pages:

User-agent: Googlebot
Disallow: /specific-page
Allow: /
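
Before publishing, you can sanity-check rules like the ones above locally. This is a minimal sketch using urllib.robotparser from Python's standard library; build_parser is a hypothetical helper, and the bot name OtherBot and the page paths are placeholders.

```python
from urllib.robotparser import RobotFileParser


def build_parser(rules):
    """Parse robots.txt rules given as a single string."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


allow_only_googlebot = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

block_one_page_for_googlebot = """\
User-agent: Googlebot
Disallow: /specific-page
Allow: /
"""

parser = build_parser(allow_only_googlebot)
print(parser.can_fetch("Googlebot", "https://your-site.com/about"))  # True
print(parser.can_fetch("OtherBot", "https://your-site.com/about"))   # False

parser = build_parser(block_one_page_for_googlebot)
print(parser.can_fetch("Googlebot", "https://your-site.com/specific-page"))  # False
print(parser.can_fetch("Googlebot", "https://your-site.com/pricing"))        # True
```

Python's parser applies the first rule that matches, while some search engines pick the most specific matching rule, so results can differ for overlapping rules. The two rule sets here give the same answer either way. Rules match by path prefix, so Disallow: /specific-page also covers any path that begins with /specific-page.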

Important

Not all bots follow the rules specified in your robots.txt file, especially malicious or poorly configured ones. As a result, these bots may still access your site, including restricted folders and pages.

Include your sitemap:

Sitemap: https://your-site.com/sitemap.xml

Note

Webflow adds a link to your sitemap in your robots.txt by default.
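
To see which sitemap links a robots.txt file declares, you can parse it locally. The sketch below assumes Python 3.8 or later (where RobotFileParser.site_maps() is available); sitemap_urls is a hypothetical helper name, and the rules string is sample content, not output copied from a Webflow site.

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(rules):
    """Return the Sitemap URLs listed in robots.txt content."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


rules = """\
User-agent: *
Disallow: /folder-name/

Sitemap: https://your-site.com/sitemap.xml
"""

print(sitemap_urls(rules))  # ['https://your-site.com/sitemap.xml']
```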

Learn more about robots.txt.

Best practices for privacy 

If you’d like to prevent the discovery of a particular page or URL on your site, don’t use the robots.txt to disallow the URL from being crawled. Instead, use either of the following options: 

FAQ and troubleshooting tips

Can I use a robots.txt file to prevent my Webflow site assets from being indexed? 

It’s not possible to use a robots.txt file to prevent Webflow site assets from being indexed because a robots.txt file must live on the same domain as the content it applies to (in this case, where the assets are served). Webflow serves assets from our global CDN, rather than from the custom domain where the robots.txt file lives. Learn more about asset and file privacy in Webflow.

I removed the robots.txt file from my Site settings, but it still shows up on my published site. How can I fix this? 

Once the robots.txt has been created, it can’t be completely removed. However, you can replace it with new rules to allow the site to be crawled, e.g.: 

User-agent: *
Disallow:

Make sure to save your changes and republish your site. If the issue persists and you still see the old robots.txt rules on your published site, please contact customer support.
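
An empty Disallow value means nothing is blocked. As a rough check, the sketch below uses Python's standard urllib.robotparser to confirm that such a rule set permits crawling; is_allowed is a hypothetical helper, and the bot name and URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules, user_agent, url):
    """Return True if the given bot may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


allow_all_rules = """\
User-agent: *
Disallow:
"""

print(is_allowed(allow_all_rules, "Googlebot", "https://your-site.com/any-page"))  # True
```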

How do I remove previously indexed pages from Google search?

You can use Google’s removals tool to remove previously indexed pages from Google search. Note that unless you also take steps to disable search engine indexing of these pages (e.g., using the Sitemap indexing toggle), they may be indexed by Google again.