If you want only specific search bots and crawlers to access your website, and want to block all others, you can whitelist user agents for your website. Here’s how to whitelist user agents in Apache.


Before proceeding, please make sure that .htaccess files and mod_rewrite are enabled in your Apache web server.


Place your .htaccess file in the document root folder of your website (/var/www/html).
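
Enabling .htaccess usually comes down to one setting in the main server configuration. The following is a minimal sketch, assuming the document root is /var/www/html and that you can edit the main configuration file (its name and location vary by distribution). On Debian or Ubuntu, mod_rewrite is typically enabled separately with the a2enmod rewrite command.

```apache
# Goes in the main server configuration or a virtual host,
# not in the .htaccess file itself.
# Assumption: the document root is /var/www/html.
<Directory /var/www/html>
    # Let .htaccess files in this tree override server settings.
    # "All" is the broadest choice; a narrower list may be enough.
    AllowOverride All
</Directory>
```

Unlike .htaccess edits, a change to the main configuration only takes effect after Apache is restarted or reloaded.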


1. Open .htaccess file

Open .htaccess file using a text editor. It is generally located at /var/www/html.

$ sudo vim /var/www/html/.htaccess


2. Whitelist User Agent

We will use the SetEnvIf and SetEnvIfNoCase directives to whitelist user agents.

If you want to allow only a specific user agent such as Wget, add the following code to your .htaccess file:

# Set the wget variable when the User-Agent contains "Wget" (case-sensitive)
SetEnvIf User-Agent "Wget" wget
Order deny,allow
Deny from all
Allow from env=wget


If you want to allow only a number of popular search engine user agents such as Google, Yahoo, etc., add the following code:

SetEnvIfNoCase User-Agent .*google.* search_robot
SetEnvIfNoCase User-Agent .*yahoo.* search_robot

Order Deny,Allow
Deny from All
Allow from env=search_robot
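
The Order, Deny and Allow directives are the Apache 2.2 style of access control. In Apache 2.4 they are available only through mod_access_compat. If you are on 2.4, a rough equivalent using the Require directive might look like the sketch below. It assumes mod_setenvif and mod_authz_core are loaded and that AllowOverride permits AuthConfig for this directory.

```apache
# Mark requests whose User-Agent contains "google" or "yahoo"
# (SetEnvIfNoCase matches case-insensitively).
SetEnvIfNoCase User-Agent "google" search_robot
SetEnvIfNoCase User-Agent "yahoo" search_robot

# Apache 2.4 style: grant access only when search_robot is set.
# All other requests receive 403 Forbidden.
Require env search_robot
```

It is safer not to combine the old and new styles in the same .htaccess file, since they can interact in surprising ways.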


You can also use RewriteCond to check the user agent and allow traffic. Replace myuseragent with the user agent you require.

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !=myuseragent
RewriteRule .* - [F,L]
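
Note that !=myuseragent is an exact string comparison, so the whole User-Agent header must equal myuseragent for a request to be allowed. Real user agent strings are usually long, so a substring match is often more practical. The sketch below allows several agents at once. The names googlebot and bingbot are placeholders chosen for illustration, not part of the original rule.

```apache
RewriteEngine on
# Forbid the request unless the User-Agent contains one of the
# listed names. "googlebot" and "bingbot" are placeholder examples.
# [NC] makes the match case-insensitive.
RewriteCond %{HTTP_USER_AGENT} !(googlebot|bingbot) [NC]
RewriteRule .* - [F,L]
```

The F flag returns 403 Forbidden to every request that does not match the whitelist.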


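For example, to allow only clients whose user agent contains Mozilla, as most browsers do (keep in mind that many bots, including Googlebot, also send Mozilla in their user agent, so this is only a rough filter):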

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !(Mozilla)
RewriteRule .* - [F,L]


Please note: If you want to restrict access only to specific directories or URLs, place the above code in a Directory block in the server configuration, or place the .htaccess file in the required directory.
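
As a sketch of the Directory block approach, the search engine whitelist from above could be limited to a single directory as follows. The path /var/www/html/private is a hypothetical example, and the block belongs in the main server configuration or a virtual host, since Directory blocks are not allowed inside .htaccess.

```apache
# /var/www/html/private is a hypothetical directory used for illustration.
<Directory /var/www/html/private>
    # Mark requests whose User-Agent contains "google".
    SetEnvIfNoCase User-Agent "google" search_robot
    # Deny everyone except requests with search_robot set.
    Order Deny,Allow
    Deny from all
    Allow from env=search_robot
</Directory>
```

The rest of the site stays open to all user agents, because the rules apply only inside the named directory.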


3. Restart Apache web server

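Restart the Apache web server to apply changes. Edits to .htaccess take effect on the next request, so a restart is needed only if you also changed the main server configuration.

$ sudo /etc/init.d/apache2 restart [Debian or Ubuntu]
$ sudo apachectl restart [RHEL, CentOS or Fedora]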


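That’s it! You can now test access to your website or specific directories, for example via localhost, by sending requests with different User-Agent headers (curl's -A option sets one).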


