How to Add Robots.txt File for WordPress

June 21, 2019

6 Min Read

Just creating a website is not enough. Getting listed in search engines is an essential goal for website owners, so that the site becomes visible in search engine results pages (SERPs) for certain keywords. This listing, and the visibility of fresh content, depends mainly on search engine robots that crawl and index websites. Webmasters can control how these robots crawl a website by placing instructions in a special file called robots.txt.

In this article, I’ll explain how to set up a WordPress robots.txt file for better SEO. Note that several pages of a WordPress website do not need to be crawled by search engines.

What Is a Robots.txt File?

A robots.txt file is a plain text file located at the root of your website that tells search engine crawlers which parts of the site not to crawl. It implements the Robots Exclusion Protocol and is typically used to keep crawlers away from low-value or private areas (e.g. your login page). Note that it is a request to well-behaved crawlers, not an access control: the file is public and does not protect sensitive files.

In short, robots.txt tells search engine bots what they should not crawl on your website.

Here is how it works! Before a search engine bot crawls a URL on your website (that is, retrieves the page so it can be indexed), it first looks for your robots.txt file and follows the rules it finds there.
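To make this lookup concrete, here is a minimal Python sketch that applies a rule set the way a well-behaved crawler would, using the standard library's urllib.robotparser. The bot name ExampleBot, the example.com URLs, and the sample rule are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Sample rules (an assumption for this sketch, not a recommended file).
RULES = """\
User-agent: *
Disallow: /wp-admin/
"""

# Parse the rules from memory instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(RULES.splitlines())

# A polite crawler asks this question before requesting each URL.
admin_allowed = parser.can_fetch("ExampleBot", "https://www.example.com/wp-admin/options.php")
post_allowed = parser.can_fetch("ExampleBot", "https://www.example.com/hello-world/")

print("admin page allowed:", admin_allowed)
print("blog post allowed:", post_allowed)
```

Note that this standard-library parser does simple prefix matching, so it is only suitable for rules without wildcards.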

Why Create Robots.txt for WordPress?

You usually don’t need a robots.txt file for a WordPress website. Search engines crawl and index the entire site by default. However, for better SEO, you can add a robots.txt file to your root directory to stop search engines from accessing specific areas of your WordPress website.

How to Create Robots.txt for WordPress?

Log in to your web hosting dashboard. In this example, I am using Cloudways, a managed cloud hosting platform.

Go to the Servers tab from the top menu bar and get your SSH/SFTP access from Server Management → Master Credentials.

Use any FTP/SFTP client to access your WordPress application files. I am using FileZilla for this tutorial. Launch it and connect to your server using the Master Credentials.

Once connected, go to the /applications folder, which holds your WordPress application files. You will see different folders there.

Now go back to the Cloudways Platform and from the top left bar, go to Applications. Select the application that you want to add the robots.txt file for:

From the left pane, go to Application Management → Application Settings → General. You will find the folder name of your application.

Go back to FileZilla and then navigate to /applications/[FOLDER NAME]/public_html. Create a new text file here and name it robots.txt.

Right click on the robots.txt file, and click View/Edit to open it in a text editor (Notepad is a handy option).
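If you would rather prepare the file on your own machine and then upload it with your FTP client, a small script can generate it. The following is only a sketch: the helper name write_robots_txt, its parameters, and the sample rule are hypothetical, and the demo writes to a temporary directory rather than a real site folder.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def write_robots_txt(directory: Path, disallowed: List[str],
                     sitemap_url: Optional[str] = None) -> Path:
    """Write a robots.txt that applies to all crawlers and return its path."""
    lines = ["User-agent: *"]
    lines.extend(f"Disallow: {path}" for path in disallowed)
    if sitemap_url:
        # Separate the sitemap line from the rule group with a blank line.
        lines.extend(["", f"Sitemap: {sitemap_url}"])
    target = directory / "robots.txt"
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


# Demo: write into a throwaway directory, then show what was written.
with tempfile.TemporaryDirectory() as tmp:
    created = write_robots_txt(Path(tmp), ["/wp-admin/"])
    content = created.read_text(encoding="utf-8")

print(content)
```

The generated file can then be uploaded to the public_html folder described above.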

Advanced Robots.txt for WordPress

Search engines like Google and Bing support the use of wildcards in the robots.txt file. These wildcards can be used to allow/disallow specific file types throughout the WordPress website.

An asterisk (*) matches any sequence of characters, so a single rule can cover a wide range of URLs.

User-agent: *
Disallow: /images/image*.jpg

Here, “*” matches any characters, so all files in the images folder whose names start with “image” and have the “jpg” extension will not be crawled by search engines.

Example: image1.jpg, image2.jpg, and imagexyz.jpg will not be crawled by search engines.

The power of * is not limited to images only. You can even disallow all files with a particular extension.

User-agent: *
Disallow: /downloads/*.pdf
Disallow: /downloads/*.png

The above statements ask all search engines not to crawl files with the extensions “pdf” and “png” found in the downloads folder.

You can even disallow WordPress core directories by using *.

User-agent: *
Disallow: /wp-*/

The above line asks search engines not to crawl directories starting with “wp-”.

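Example: wp-includes, wp-content, etc. will not be crawled by search engines. Be careful with this rule: wp-content and wp-includes hold the CSS, JavaScript, and images that search engines need to render your pages.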

Another special character supported in the WordPress robots.txt file is the dollar symbol ($), which marks the end of a URL.

User-agent: *
Disallow: /referral.php

The above statement asks search engines not to crawl referral.php, and also referral.php?id=123 and so on.

But what if you want to block referral.php only? You only have to add the $ symbol right after referral.php.

The $ symbol ensures that only referral.php is blocked, but not referral.php?id=123.

User-agent: *
Disallow: /referral.php$

You can use $ for directories too.

User-agent: *
Disallow: /wp-content/

This instructs search engines not to crawl the wp-content folder or anything located inside it. If you want to disallow only the wp-content folder URL itself rather than its contents, use the $ symbol. For example:

User-agent: *
Disallow: /wp-content/$

The $ symbol ensures that only the URL /wp-content/ itself is disallowed. The files and directories inside this folder can still be crawled.
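To see how the * and $ rules above behave, here is a small Python sketch that turns a Disallow path into a regular expression. It assumes the matching behaviour documented by Google and Bing (prefix matching, * for any run of characters, a trailing $ for end of URL); the function names are hypothetical and this is not how any search engine is actually implemented.

```python
import re


def rule_to_regex(rule_path: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = rule_path.endswith("$")
    body = rule_path[:-1] if anchored else rule_path
    # Escape literal text; each "*" becomes ".*" (any run of characters).
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    # A trailing "$" in the rule pins the match to the end of the URL.
    return re.compile(regex + (r"\Z" if anchored else ""))


def is_blocked(rule_path: str, url_path: str) -> bool:
    """Return True if url_path falls under the given Disallow pattern."""
    # re.match anchors at the start, so unanchored rules act as prefixes.
    return rule_to_regex(rule_path).match(url_path) is not None


checks = [
    ("/images/image*.jpg", "/images/image1.jpg"),
    ("/images/image*.jpg", "/images/photo.jpg"),
    ("/wp-*/", "/wp-content/themes/style.css"),
    ("/referral.php", "/referral.php?id=123"),
    ("/referral.php$", "/referral.php?id=123"),
    ("/wp-content/$", "/wp-content/uploads/"),
]
for rule, path in checks:
    print(f"{rule!r} blocks {path!r}: {is_blocked(rule, path)}")
```

Running the checks shows the difference the $ makes: the plain /referral.php rule also catches the query-string URL, while /referral.php$ does not.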

Below is the robots.txt file for the Cloudways blog.

User-agent: *
Disallow: /admin/
Disallow: /admin/*?*
Disallow: /admin/*?
Disallow: /blog/*?*
Disallow: /blog/*?

The first line indicates the User-agent. This names the search engine crawler that the rules which follow apply to. A complete list of all search engine bots is available here.

User-agent: *

Here, * means all search engines. You can also specify rules for each search engine separately.

Disallow: /admin/
Disallow: /admin/*?*
Disallow: /admin/*?

These rules stop search engines from crawling the “admin” directory. (The first rule already covers every URL under /admin/, so the two query-string rules add nothing extra here.) Search engines rarely need to index these directories.

Disallow: /blog/*?*
Disallow: /blog/*?

If your WordPress site is a blog, it is best practice to stop search engine bots from crawling URLs that contain query strings, such as your search queries.

If your site has a sitemap, adding its URL helps search engine bots find the sitemap file. This can result in faster indexing of pages.

sitemap: http://www.yoursite.com/sitemap.xml
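As a quick check that a crawler can pick up the sitemap line, the following Python sketch reads it back with the standard library's urllib.robotparser. It assumes Python 3.8 or later, where site_maps() is available, and reuses the placeholder URL from the line above.

```python
from urllib.robotparser import RobotFileParser

# Sample file content; the sitemap URL is the placeholder used in this article.
RULES = """\
User-agent: *
Disallow: /admin/

sitemap: http://www.yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# site_maps() returns a list of sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```

The directive name is matched case-insensitively by this parser, so both sitemap: and Sitemap: are read.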

What to Include in Robots.txt for WordPress?

You decide which parts of your WordPress site should appear in SERPs. Everyone has their own view on setting up a WordPress robots.txt file. Some recommend not adding a robots.txt file to WordPress at all. In my opinion, you should add one and disallow the /wp-admin/ folder. Keep in mind that the robots.txt file is public: you can view the robots.txt file of any website by visiting www.example.com/robots.txt.

We’re done with the robots.txt file in WordPress. If you have any questions about setting up a robots.txt file, feel free to ask in the comment section below.
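Because the file always lives at the same path on a site's root, its address can be derived from any page URL. Here is a minimal Python sketch; the function name robots_txt_url and the sample URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


location = robots_txt_url("https://www.example.com/blog/some-post/?utm=x")
print(location)
```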

Wrapping up!

As you can see, the robots.txt file is a useful tool for your SEO. It lets you tell search engine robots what to crawl and what not to crawl. But it must be handled with care. A bad configuration can lead to your whole website being dropped from search results (for example, if you use Disallow: /). So, be careful!

Now it’s your turn. Tell me if you use this type of file and how you configure it. Share your feedback in the comments.
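One way to guard against this mistake is to test a file before uploading it. The Python sketch below uses the standard library's urllib.robotparser to check whether a rule set blocks the site root; the helper name blocks_entire_site and the bot name ExampleBot are hypothetical, and the check only covers simple rules without wildcards.

```python
from urllib.robotparser import RobotFileParser


def blocks_entire_site(robots_txt: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the rules stop user_agent from fetching the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


dangerous = "User-agent: *\nDisallow: /\n"
safe = "User-agent: *\nDisallow: /wp-admin/\n"

print("dangerous file blocks everything:", blocks_entire_site(dangerous))
print("safe file blocks everything:", blocks_entire_site(safe))
```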

Q1. What is robots.txt?

The robots.txt file is a text file placed at the root of your website. It is intended to stop search engine robots from crawling certain areas of your website. The robots.txt file is one of the first files requested by spiders (robots).

Q2. Why is a robots.txt file used?

The robots.txt file gives instructions to the search engine robots that analyze your website. It is an exclusion protocol for robots. With this file, you can ask some robots (also called “crawlers” or “spiders”) not to explore your site.

Mustaasam Saleem

Mustaasam is the WordPress Community Manager at Cloudways - A Managed WordPress Hosting Platform, where he actively works and loves sharing his knowledge with the WordPress Community. When he is not working, you can find him playing squash with his friends, or defending in Football, and listening to music. You can email him at mustaasam.saleem@cloudways.com
