The professional way to create a robots.txt file

Robots.txt is a text file that allows a website to provide instructions to web-crawling bots.

Search engines like Google use these web crawlers, sometimes called web bots, to archive and rank websites. Most robots are configured to look for a robots.txt file on the server before they read any other file from a website. They do this to see whether the website owner has special instructions on how to crawl and index the site.

The robots.txt file contains a set of instructions that ask robots to ignore certain files or directories. This may be for privacy purposes, or because the website owner believes the contents of those files and directories are irrelevant to the website’s ranking in search engines.

If a website has more than one subdomain, then each subdomain must have its own robots.txt file. It is important to note that not all robots will respect the robots.txt file. Some malicious bots will even read the robots.txt file to find which files and directories to target first. Also, even if the robots.txt file instructs robots to ignore certain pages on the site, those pages may still appear in search results if other crawled pages link to them.
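
For instance (using hypothetical hostnames), a crawler visiting each of these subdomains would look for a separate file, and each file only governs the subdomain it is served from:

https://example.com/robots.txt
https://blog.example.com/robots.txt
https://shop.example.com/robots.txt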

How to Optimize your Robots.txt File in WordPress for SEO

One of our readers recently asked us for tips on how to optimize the robots.txt file to improve SEO.

The robots.txt file tells search engines how to crawl your website, which makes it an incredibly powerful SEO tool.

In this article, we will show you how to create a perfect robots.txt file for SEO.

What is a robots.txt file?

Robots.txt is a text file that website owners can create to tell search engine robots how to crawl and index pages on their sites.

It is usually stored in the root directory, also known as the home folder, of your website. The basic format of a robots.txt file looks like this:

User-agent: [user-agent name]
Disallow: [URL string not to be crawled]

User-agent: [user-agent name]
Allow: [URL string to be crawled]

Sitemap: [URL of your XML Sitemap]

You can have multiple lines of instructions to allow or disallow specific URLs and add multiple sitemaps. If you don’t block the URL, the search engine bots will assume that they are allowed to crawl it.

This is what an example robots.txt file might look like:

User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap_index.xml

In the robots.txt file example above, we allowed search engines to crawl and index files in the WordPress uploads folder.

Next, we prevented search bots from crawling and indexing plugins and WordPress admin folders.

Finally, we provided the URL of our XML sitemap.

Do you need a Robots.txt file for your WordPress site?

If you don’t have a robots.txt file, search engines will continue to crawl and index your website. However, you won’t be able to tell search engines which pages or folders they shouldn’t crawl.

This won’t have much of an impact when you’re starting a blog and it doesn’t have a lot of content.

However, as your website grows and you have a lot of content, you will likely want to have better control over how your website is crawled and indexed.

Here’s why.

Search bots have a crawl quota for each website.

This means that they crawl a certain number of pages during a crawl session. If they don’t finish crawling all the pages on your site, they will come back and resume crawling in the next session.

This can slow down your website’s indexing rate.

You can fix this by disallowing search bots from crawling unnecessary pages like WordPress admin pages, plugin files, and the theme folder.

By disallowing unnecessary pages, you save your crawl quota. This helps search engines crawl and index more pages on your site as quickly as possible.
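
As a rough sketch, disallow rules covering those areas might look like this (the paths assume a standard WordPress install, so adjust them to match your own setup):

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/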

Another good reason to use robots.txt is when you want to prevent search engines from indexing a post or page on your website.

It’s not the safest way to hide content from the general public, but it will help you prevent that content from showing up in search results.
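
For example, a rule like the following asks crawlers to skip a single page (the path /private-page/ here is purely a placeholder):

User-agent: *
Disallow: /private-page/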

What does a perfect Robots.txt file look like?

Many popular blogs use a very simple robots.txt file. Its contents may vary depending on the needs of the specific site:

User-agent: *
Disallow:
  
Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml

This robots.txt file allows all bots to index all content and provides them with links to the website’s XML sitemaps.

For WordPress sites, we recommend the following rules in the robots.txt file:

User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /readme.html
Disallow: /refer/
 
Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml

This tells search bots that they may index all WordPress images and uploaded files, but not the WordPress plugins folder, the WordPress admin area, the WordPress readme file, or affiliate links.

By adding sitemaps to your robots.txt file, you make it easier for Google robots to find all the pages on your site.

Now that you know what an ideal robots.txt file looks like, let’s take a look at how to create a robots.txt file in WordPress.

How to create a Robots.txt file in WordPress?

There are two ways to create a robots.txt file in WordPress. You can choose the method that suits you.

Method 1: Edit Robots.txt File with All in One SEO

All in One SEO, also known as AIOSEO, is the best WordPress SEO plugin on the market, used by more than two million websites.

It is easy to use and comes with a robots.txt file generator.

Once the plugin is installed and activated, you can use it to create and edit a robots.txt file directly from your WordPress admin area.

Just go to All in One SEO » Tools to edit your robots.txt file.

First, you’ll need to turn on the edit option by clicking the “Enable Custom Robots.txt File” toggle so that it switches to blue.

With this switch, you can create a custom robots.txt file in WordPress.

All in One SEO will display the current robots.txt file in the “Robots.txt file preview” section at the bottom of the screen.

This version will display the default rules added by WordPress.

All in One SEO robots.txt file

These default rules tell search engines not to crawl your WordPress core files, allow bots to index all content, and provide them with a link to your site’s XML sitemaps.
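
For reference, the default rules WordPress generates typically look something like this (the exact output can vary with your WordPress version and installed plugins, and the sitemap URL below is a placeholder):

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/wp-sitemap.xml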

Now, you can add your own custom rules to optimize your robots.txt for SEO.

To add a rule, enter the user agent in the “User agent” field. Using * will apply the rule to all user agents.

Then decide whether you want to “Allow” or “Disallow” search engines to crawl it.

Next, enter the file name or directory path in the “Directory path” field.

The rule will be automatically applied to your robots.txt file. To add another rule, click on the Add Rule button.

We recommend adding rules until you’ve created the perfect robots.txt format that we shared above.

Your custom rules will look like this.
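
As a sketch, the preview with the recommended rules in place would read roughly like this (sitemap entries may be handled separately by the plugin, so they are omitted here):

User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /readme.html
Disallow: /refer/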

Once you’re done, don’t forget to click the Save Changes button to store your changes.

Method 2: Edit the Robots.txt file manually using FTP

For this method, you will need to use an FTP client to edit the robots.txt file.

Simply connect to your WordPress hosting account using an FTP client.

Once in, you will be able to see the robots.txt file in the root folder of your website.

Robots.txt is a plain text file, which means you can download it to your computer and edit it with any plain text editor like Notepad or TextEdit.

After saving your changes, you can upload the file back to the root folder of your website.

How do you test your Robots.txt file?

Once you have created your robots.txt file, it is always a good idea to test it with the robots.txt test tool.

There are many robots.txt testing tools out there, but we recommend the one inside Google Search Console.

First, you will need to connect your website to Google Search Console. If you haven’t done so yet, check out our guide on how to add your WordPress site to Google Search Console.

After that, you can use the Google Search Console Robots Test Tool.

Simply select your property from the dropdown list.

The tool will automatically fetch your website’s robots.txt file and flag any errors and warnings it finds.

Google Search Console Robots Test Tool
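
If you’d rather run a quick check from your own machine, Python’s standard library also includes a robots.txt parser. The minimal sketch below fetches a live robots.txt file and tests a couple of URLs against it; the domain and paths are hypothetical placeholders:

from urllib import robotparser

# Point the parser at the live robots.txt file and download it
rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

# Ask whether a generic crawler ("*") may fetch specific URLs
print(rp.can_fetch("*", "https://example.com/wp-admin/"))                 # expected: False
print(rp.can_fetch("*", "https://example.com/wp-content/uploads/a.jpg"))  # expected: True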

Final thoughts

The goal of robots.txt optimization is to prevent search engines from crawling pages that are not publicly available, such as pages in your wp-content/plugins folder or in your WordPress admin folder.

A common myth among SEO experts is that blocking WordPress category, tag, and archive pages will improve crawl rate and lead to faster indexing and higher rankings.

This is not true. It is also against Google’s Webmaster Guidelines.

We recommend following the robots.txt format above to create a robots.txt file for your website.