Robots.txt Generator

Generate a valid robots.txt file with custom User-Agent blocks, Allow/Disallow rules, crawl-delay, and sitemap URL.


The Robots.txt Generator is a free, browser-based tool that creates a valid robots.txt file without writing any code. Add one or more User-Agent blocks, define Allow and Disallow paths for each, set an optional crawl delay, and add your sitemap URL — then download the ready-to-upload file in seconds. All processing happens entirely in your browser.

Features

Multiple User-Agent blocks - Add separate rule sets for different crawlers (e.g. *, Googlebot, Bingbot)
Allow paths - Specify paths crawlers are explicitly permitted to access
Disallow paths - Block specific paths or directories from crawling
Crawl-delay - Set an optional global crawl delay (in seconds) for all bots
Sitemap URL - Append your sitemap location to help search engines discover your pages
Add / remove blocks - Dynamically add new User-Agent blocks or remove existing ones
Stats bar - Shows block count, total rule count, and file size after generation
Copy to clipboard - Copy the full output with one click
Download .txt - Save the result as robots.txt directly from the browser
Default template - Starts with a sensible * block with /admin/ and /private/ disallowed

How to Use

  1. The tool starts with one default User-Agent block for * (all crawlers). Edit it or add new blocks as needed.
  2. In each block, enter the User-Agent name — use * to target all robots, or a specific name like Googlebot.
  3. List Allow paths (one per line) for paths you explicitly want to permit.
  4. List Disallow paths (one per line) for paths you want to block.
  5. Optionally set a Crawl-delay value (in seconds) to throttle bot requests.
  6. Optionally enter your Sitemap URL so search engines can find your XML sitemap.
  7. Click Generate robots.txt.
  8. Review the output, then click Download .txt to save the file.
  9. Upload robots.txt to the root of your website so it is accessible at https://yourdomain.com/robots.txt.
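For example, starting from the default template and filling in a crawl delay and a sitemap URL produces a file along these lines (the exact layout the tool emits may differ slightly):

```
User-agent: *
Disallow: /admin/
Disallow: /private/
Crawl-delay: 10

Sitemap: https://yourdomain.com/sitemap.xml
```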

robots.txt Format

A robots.txt file consists of one or more groups, each starting with one or more User-agent directives followed by Allow and Disallow rules:

User-agent - Specifies which crawler the following rules apply to; use * for all bots
Allow - Explicitly permits access to a path, even if a parent directory is disallowed
Disallow - Blocks access to a path or directory
Crawl-delay - Tells the bot to wait N seconds between requests (not supported by Google)
Sitemap - Points to the location of your XML sitemap
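Taken together, a minimal file exercising every directive might look like this (example.com and the paths are placeholders):

```
User-agent: *
Disallow: /tmp/
Allow: /tmp/public/
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
```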

Targeting Specific Crawlers

You can add multiple blocks to apply different rules to different bots:

User-agent: *
Disallow: /admin/
Disallow: /private/

User-agent: Googlebot
Allow: /
Crawl-delay: 5

User-agent: Bingbot
Disallow: /

Common crawler names include Googlebot, Bingbot, Slurp (Yahoo), DuckDuckBot, Baiduspider, facebookexternalhit, Twitterbot, and AhrefsBot.

Path Matching Rules

  • A path must start with /.
  • A trailing / matches a directory and all its contents: /admin/ blocks everything under /admin/.
  • An asterisk * within a path matches any sequence of characters: /search?* blocks all search query URLs.
  • A $ at the end anchors the match to the end of the URL: /page$ matches only /page, not /pages.
  • When an Allow rule and a Disallow rule both match the same path, the more specific (longer) rule wins; if they are equally specific, Allow takes precedence.
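The wildcard and anchoring rules above can be sketched as a small matcher in Python. rule_matches is a hypothetical helper written for this page, not a library function, and it implements only the pattern matching itself; real crawlers additionally pick the longest matching rule across all Allow and Disallow lines:

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    '*' matches any sequence of characters, and a trailing '$'
    anchors the pattern to the end of the URL; everything else
    is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Translate the robots.txt pattern into a regular expression.
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in pattern)
    if not anchored:
        regex += ".*"  # un-anchored patterns match any continuation
    return re.fullmatch(regex, path) is not None

print(rule_matches("/admin/", "/admin/settings"))   # True: directory prefix
print(rule_matches("/search?*", "/search?q=bots"))  # True: '*' wildcard
print(rule_matches("/page$", "/page"))              # True: anchored exact match
print(rule_matches("/page$", "/pages"))             # False: '$' rejects /pages
```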

Use Cases

Blocking Admin and Private Areas

Use Disallow: /admin/ and Disallow: /private/ under User-agent: * to prevent all bots from crawling sensitive sections of your website.

Preventing Crawling of Development Environments

Block all crawlers on a staging or development site using Disallow: / to keep the site out of search engine indices.

Optimising Crawl Budget

Large websites can guide search engines to focus on important content by disallowing low-value pages such as search results pages, filtered category pages, or session-based URLs.
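For instance, a large catalogue site might keep crawlers away from internal search results and parameterised listing URLs (the paths below are illustrative, not prescriptive):

```
User-agent: *
Disallow: /search
Disallow: /*?sort=
Disallow: /*sessionid=
```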

Allowing Specific Bots Past a General Block

Disallow all bots with a User-agent: * block containing Disallow: /, then add a separate block for trusted crawlers with Allow: / to permit selective access.
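Written out, that pattern looks like the following (Googlebot stands in for whichever crawler you choose to trust):

```
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
```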

Pointing Crawlers to Your Sitemap

Always include a Sitemap: directive pointing to your XML sitemap. This ensures search engines can discover all your important pages even if they are not linked prominently.
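The directive takes a full absolute URL, and you can list more than one sitemap, each on its own line:

```
Sitemap: https://yourdomain.com/sitemap.xml
```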

Frequently Asked Questions

Where should I upload robots.txt?

Place the file in the root directory of your website so it is accessible at https://yourdomain.com/robots.txt. A robots.txt file applies only to the host it is served from, so each subdomain (e.g. blog.yourdomain.com) needs its own file.

Does robots.txt prevent a page from appearing in search results?

No. robots.txt prevents crawlers from visiting a page, but if the page is linked from another site, search engines may still index it without visiting it. To prevent a page from appearing in search results, use a noindex meta tag or HTTP header instead.
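For reference, the two noindex mechanisms look like this; note that the page must remain crawlable (not disallowed) for search engines to see either one:

```
<meta name="robots" content="noindex">
```

or, as an HTTP response header:

```
X-Robots-Tag: noindex
```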

What is the difference between Allow and Disallow?

Disallow blocks access to a path. Allow grants access to a specific path even when a parent directory is disallowed. When both match a URL, the more specific (longer) rule wins, and Allow wins ties.

Does Google respect Crawl-delay?

Google does not support the Crawl-delay directive. Googlebot manages its crawl rate automatically based on how your server responds, and the crawl rate limiter that used to be available in Google Search Console has been retired.

Can I use wildcards in paths?

Yes. Google and most modern crawlers support * (match any sequence) and $ (end of URL) wildcards in paths. Example: Disallow: /*.pdf$ blocks all PDF files.

What happens if robots.txt is missing?

If no robots.txt file exists, crawlers assume there are no restrictions and will crawl the entire site.
