How to Use Robots.txt to Disallow All Pages from Search Engines & Robots



While it is important to make your website search-friendly, there are many situations when you don't want search engines to crawl your website, especially if it contains confidential or sensitive information. Sure, you can keep all of this behind a login page, but a login alone can be a target for attackers. One of the easiest ways to keep your website from being discovered through search is to block it from search robots, and you can do that with the help of a simple text file named robots.txt. Here is how to use robots.txt to disallow all pages on your website from search bots and crawlers.

 


Robots.txt is a simple way to tell search engines and crawlers that you don't want them to access specific sections of your website, or all of it.

Create an empty text file named 'robots.txt' and place it at the root of your website. So, if your domain is www.example.com, then robots.txt should be accessible at http://www.example.com/robots.txt

Now open the file with a text editor and copy-paste the following lines into it.

User-agent: *
Disallow: /

The above lines tell all search engine crawlers not to access any of your web pages or files, so compliant bots will stop crawling your site.
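
If you want to double-check how a crawler that respects robots.txt will interpret these two lines, here is a minimal sketch using Python's built-in urllib.robotparser module (the example.com URLs are just placeholders):

from urllib.robotparser import RobotFileParser

# Parse the disallow-all rules directly, without any network request.
rules = ["User-agent: *", "Disallow: /"]
parser = RobotFileParser()
parser.parse(rules)

# Every path is reported as blocked, for any user agent.
print(parser.can_fetch("Googlebot", "http://www.example.com/"))        # False
print(parser.can_fetch("Bingbot", "http://www.example.com/any/page"))  # False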

 

If you want to block only Googlebot, you can paste the following lines instead:

User-agent: Googlebot
Disallow: /
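
If the Googlebot group above is the only one in the file, every other crawler remains unrestricted by default. If you'd rather make that explicit, a robots.txt along these lines should work; note that an empty Disallow value means nothing is blocked for that group:

User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: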

 

Here are a few of the most common robot (user-agent) names; an example that blocks all three at once follows the list:

  • Googlebot – Google.com
  • YandexBot – Yandex.ru
  • Bingbot – Bing.com
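
For example, to block all three of these bots while leaving every other crawler alone, you could give each one its own group, separated by a blank line:

User-agent: Googlebot
Disallow: /

User-agent: YandexBot
Disallow: /

User-agent: Bingbot
Disallow: /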


If you only want to block a specific subfolder (e.g. /wp-admin), then you can use the following directives:

User-agent: *
Disallow: /wp-admin/
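
The same approach extends to multiple folders: add one Disallow line per path under the same group. In the sketch below, /private/ and /tmp/ are just placeholder folder names used for illustration:

User-agent: *
Disallow: /wp-admin/
Disallow: /private/
Disallow: /tmp/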

Every time a well-behaved search bot visits your site, it first looks for robots.txt at the root. Once you have created your robots.txt, crawlers will fetch it automatically and crawl (or skip) your pages according to its directives.
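
You can mimic what a well-behaved crawler does with a short Python sketch that fetches a live robots.txt and checks a URL against it (again, the example.com URLs are only placeholders):

from urllib.robotparser import RobotFileParser

# Fetch and parse the site's robots.txt, just like a polite crawler would.
parser = RobotFileParser()
parser.set_url("http://www.example.com/robots.txt")
parser.read()

# Ask whether a given user agent is allowed to fetch a given URL.
print(parser.can_fetch("Googlebot", "http://www.example.com/some-page.html"))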

For more information on robots.txt, you can check out www.robotstxt.org

 

Conclusion – Robots.txt to Disallow All

You can use robots.txt to disallow all pages on your site. It is a powerful directive that stops search engine bots from crawling your website, so you have to be careful with it. As long as you keep this directive in your robots.txt file, compliant search engines will not crawl your site, and your website will remain undiscovered in search. If you change your mind and want your web pages crawled again, all you need to do is remove the lines you've added. The next time the file is fetched, search engines will start crawling your website again and will typically re-index it within a few days.

About Sreeram Sreenivasan

Sreeram Sreenivasan is the Founder of Ubiq, a business dashboard & reporting platform for small & medium businesses. Ubiq makes it easy to build business dashboards & reports for your business. Try it for free today!