
How to block BaiduSpider bot User-Agent

The Baidu spider (user agent Baiduspider) can be a real pain to block, especially since it does not respect robots.txt as it should. The IIS URL Rewrite rule in this post blocks the Baidu spider based on its User-Agent string instead.

Normally you would block a bot or spider using one of the following robots.txt files:

User-agent: BaiduSpider
Disallow: /

or, to deny all crawlers:

User-agent: *
Disallow: /

This doesn’t work for Baidu…

You can use the following IIS URL Rewrite rule to block the BaiduSpider User-Agent on your website. Requests for robots.txt are still allowed; all other requests from this user agent are rejected with an HTTP 403 (Forbidden) response.

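To block more bots, extend the pattern= attribute with additional user agent strings, separated by a pipe (|). For example pattern="Baiduspider|Bing" or pattern="Googlebot|Bing". The pattern is a regular expression that is matched anywhere in the User-Agent header, so a short string such as Bing also matches longer names like bingbot.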
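As an illustration, the user agent condition for the first example could look like the fragment below. This is only a sketch of that single condition, using the Baiduspider|Bing pattern from the text; the rest of the rule stays unchanged.

```xml
<!-- Matches any User-Agent containing "Baiduspider" or "Bing" (case-insensitive) -->
<add input="{HTTP_USER_AGENT}" pattern="Baiduspider|Bing" negate="false" ignoreCase="true" />
```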

Hint, search IIS URL Rewrite related posts on Saotn.org!

<!--
  Block BaiduSpider
-->
<rule name="block_BaiduSpider" stopProcessing="true">
  <match url="(.*)" />
  <conditions trackAllCaptures="true">
    <!-- Match any request whose User-Agent contains "Baiduspider" -->
    <add input="{HTTP_USER_AGENT}" pattern="Baiduspider" negate="false" ignoreCase="true" />
    <!-- ...except requests for /robots.txt, which stay allowed -->
    <add input="{URL}" pattern="^/robots\.txt" negate="true" ignoreCase="true" />
  </conditions>
  <!-- Abort the request with a 403 response -->
  <action type="CustomResponse"
    statusCode="403"
    statusReason="Forbidden: Access is denied."
    statusDescription="Access is denied!" />
</rule>
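The rule above is only a fragment. As a minimal sketch of where it might go, assuming the IIS URL Rewrite module is installed and the site's web.config has no other rewrite rules yet, a complete file could look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Block BaiduSpider, but keep robots.txt reachable -->
        <rule name="block_BaiduSpider" stopProcessing="true">
          <match url="(.*)" />
          <conditions trackAllCaptures="true">
            <add input="{HTTP_USER_AGENT}" pattern="Baiduspider" negate="false" ignoreCase="true" />
            <add input="{URL}" pattern="^/robots\.txt" negate="true" ignoreCase="true" />
          </conditions>
          <action type="CustomResponse"
            statusCode="403"
            statusReason="Forbidden: Access is denied."
            statusDescription="Access is denied!" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
```

If the web.config already contains a rules section, only the rule element needs to be added to it. Rules are evaluated in the order in which they are listed, so placing this one near the top should let it reject the request before other rules run.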

Verifying the rewrite rule to block Baidu

You can easily verify the rewrite rule with Fiddler’s Composer option: compose an HTTP request with the User-Agent header set to Baiduspider and check the response, as shown in the next two images.

Verifying BaiduSpider is blocked with Fiddler: request
Verifying BaiduSpider is blocked with Fiddler: response


Jan Reilink

My name is Jan. I am not a hacker, coder, developer, programmer or guru. I am merely a system administrator, doing my daily thing at Vevida in the Netherlands. With over 15 years of experience, my specialties include Windows Server, IIS, Linux (CentOS, Debian), security, PHP, websites & optimization.
