Robots.txt Generator
Generate a valid robots.txt file with custom User-Agent blocks, Allow/Disallow rules, crawl-delay, and sitemap URL.
The Robots.txt Generator is a free, browser-based tool that creates a valid robots.txt file without writing any code. Add one or more User-Agent blocks, define Allow and Disallow paths for each, set an optional crawl delay, and add your sitemap URL — then download the ready-to-upload file in seconds. All processing happens entirely in your browser.
Features
| Feature | Detail |
|---|---|
| Multiple User-Agent blocks | Add separate rule sets for different crawlers (e.g. *, Googlebot, Bingbot) |
| Allow paths | Specify paths crawlers are explicitly permitted to access |
| Disallow paths | Block specific paths or directories from crawling |
| Crawl-delay | Set an optional global crawl delay (in seconds) for all bots |
| Sitemap URL | Append your sitemap location to help search engines discover your pages |
| Add / remove blocks | Dynamically add new User-Agent blocks or remove existing ones |
| Stats bar | Shows block count, total rule count, and file size after generation |
| Copy to clipboard | Copy the full output with one click |
| Download .txt | Save the result as robots.txt directly from the browser |
| Default template | Starts with a sensible * block with /admin/ and /private/ disallowed |
How to Use
- The tool starts with one default User-Agent block for `*` (all crawlers). Edit it or add new blocks as needed.
- In each block, enter the User-Agent name — use `*` to target all robots, or a specific name like `Googlebot`.
- List Allow paths (one per line) for paths you explicitly want to permit.
- List Disallow paths (one per line) for paths you want to block.
- Optionally set a Crawl-delay value (in seconds) to throttle bot requests.
- Optionally enter your Sitemap URL so search engines can find your XML sitemap.
- Click Generate robots.txt.
- Review the output, then click Download .txt to save the file.
- Upload `robots.txt` to the root of your website so it is accessible at `https://yourdomain.com/robots.txt`.
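The assembly step the tool performs can be sketched in a few lines of Python. The block structure and field names below are illustrative assumptions, not the tool's actual internals:

```python
def generate_robots_txt(blocks, crawl_delay=None, sitemap=None):
    """Build a robots.txt string from a list of user-agent blocks.

    Each block is a dict with a 'user_agent' name and optional
    'allow' / 'disallow' path lists (hypothetical structure).
    """
    lines = []
    for block in blocks:
        lines.append(f"User-agent: {block['user_agent']}")
        for path in block.get("allow", []):
            lines.append(f"Allow: {path}")
        for path in block.get("disallow", []):
            lines.append(f"Disallow: {path}")
        if crawl_delay is not None:
            lines.append(f"Crawl-delay: {crawl_delay}")
        lines.append("")  # blank line separates groups
    if sitemap:
        lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines).rstrip() + "\n"

# The tool's default template: one * block with /admin/ and /private/ disallowed.
print(generate_robots_txt(
    [{"user_agent": "*", "disallow": ["/admin/", "/private/"]}],
    sitemap="https://yourdomain.com/sitemap.xml",
))
```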
robots.txt Format
A robots.txt file consists of one or more groups, each starting with one or more User-agent directives followed by Allow and Disallow rules:
| Directive | Description |
|---|---|
| User-agent | Specifies which crawler the following rules apply to. Use * for all bots |
| Allow | Explicitly permits access to a path, even if a parent directory is disallowed |
| Disallow | Blocks access to a path or directory |
| Crawl-delay | Tells the bot to wait N seconds between requests (not supported by Google) |
| Sitemap | Points to the location of your XML sitemap |
Targeting Specific Crawlers
You can add multiple blocks to apply different rules to different bots:
```
User-agent: *
Disallow: /admin/
Disallow: /private/

User-agent: Googlebot
Allow: /
Crawl-delay: 5

User-agent: Bingbot
Disallow: /
```
Common crawler names include Googlebot, Bingbot, Slurp (Yahoo), DuckDuckBot, Baiduspider, facebookexternalhit, Twitterbot, and AhrefsBot.
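You can sanity-check a multi-block file like the one above with Python's standard-library parser, which matches each bot against its own group and falls back to the `*` group:

```python
from urllib import robotparser

rules = """\
User-agent: *
Disallow: /admin/

User-agent: Googlebot
Allow: /
Crawl-delay: 5

User-agent: Bingbot
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Googlebot", "/admin/"))     # Googlebot has its own Allow: / block
print(rp.can_fetch("Bingbot", "/anything"))     # Bingbot is blocked entirely
print(rp.can_fetch("SomeOtherBot", "/admin/"))  # falls back to the * block
print(rp.crawl_delay("Googlebot"))              # crawl delay from the Googlebot block
```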
Path Matching Rules
- A path must start with `/`.
- A trailing `/` matches a directory and all its contents: `/admin/` blocks everything under `/admin/`.
- An asterisk `*` within a path matches any sequence of characters: `/search?*` blocks all search query URLs.
- A `$` at the end anchors the match to the end of the URL: `/page$` matches only `/page`, not `/pages`.
- Allow rules take precedence over Disallow rules when both match the same path.
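The Allow-over-Disallow behaviour can be demonstrated with the standard library. One caveat: `urllib.robotparser` applies rules in file order (first match wins) and does not support `*` or `$` wildcards, unlike Googlebot's longest-match rule, so the more specific Allow line is placed first in this sketch:

```python
from urllib import robotparser

rules = """\
User-agent: *
Allow: /admin/public/
Disallow: /admin/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "/admin/public/page.html"))  # allowed: Allow matches first
print(rp.can_fetch("*", "/admin/settings"))          # blocked by Disallow: /admin/
print(rp.can_fetch("*", "/blog/post"))               # no rule matches: allowed
```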
Use Cases
Blocking Admin and Private Areas
Use Disallow: /admin/ and Disallow: /private/ under User-agent: * to prevent all bots from crawling sensitive sections of your website.
Preventing Crawling of Development Environments
Block all crawlers on a staging or development site using Disallow: / to keep the site out of search engine indices.
Optimising Crawl Budget
Large websites can guide search engines to focus on important content by disallowing low-value pages such as search results pages, filtered category pages, or session-based URLs.
Allowing Specific Bots Past a General Block
Disallow all bots with User-agent: * / Disallow: /, then add a separate block for trusted crawlers with Allow: / to permit selective access.
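A minimal file following this pattern might look like the following (the choice of Googlebot as the trusted crawler is just an example):

```
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
```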
Pointing Crawlers to Your Sitemap
Always include a Sitemap: directive pointing to your XML sitemap. This ensures search engines can discover all your important pages even if they are not linked prominently.
Frequently Asked Questions
Where should I upload robots.txt?
Place the file in the root directory of your website so it is accessible at https://yourdomain.com/robots.txt. Each host has exactly one robots.txt, so subdomains such as blog.yourdomain.com need their own file.
Does robots.txt prevent a page from appearing in search results?
No. robots.txt prevents crawlers from visiting a page, but if the page is linked from another site, search engines may still index it without visiting it. To prevent a page from appearing in search results, use a noindex meta tag or HTTP header instead.
What is the difference between Allow and Disallow?
Disallow blocks access to a path. Allow is used to grant access to a specific path when a parent directory is disallowed. Allow takes precedence over Disallow when both match.
Does Google respect Crawl-delay?
Google does not support the Crawl-delay directive. Googlebot adjusts its crawl rate automatically based on how your server responds, and the crawl-rate limiter that previously existed in Google Search Console has been retired.
Can I use wildcards in paths?
Yes. Google and most modern crawlers support * (match any sequence) and $ (end of URL) wildcards in paths. Example: Disallow: /*.pdf$ blocks all PDF files.
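Because Python's `urllib.robotparser` does not implement these wildcards, a hedged sketch of how a Google-style pattern can be translated into a regular expression may be useful (the function name is hypothetical, and real crawlers apply additional rules such as longest-match precedence):

```python
import re

def robots_pattern_to_regex(pattern):
    """Translate a Google-style robots.txt path pattern to a compiled regex.

    '*' matches any character sequence; a trailing '$' anchors the match
    to the end of the URL; everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.compile("^" + regex + ("$" if anchored else ""))

pdf_rule = robots_pattern_to_regex("/*.pdf$")
print(bool(pdf_rule.match("/files/report.pdf")))      # True: URL ends in .pdf
print(bool(pdf_rule.match("/files/report.pdf?v=2")))  # False: $ anchors the end
```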
What happens if robots.txt is missing?
If no robots.txt file exists, crawlers assume there are no restrictions and will crawl the entire site.