How to block BaiduSpider bot User-Agent

The Baidu spider (BaiduSpider user agent) can be a real pain to block, especially since it does not respect robots.txt as it should. The following IIS URL Rewrite snippet blocks the Baidu spider based on its User-Agent string.

Normally you would block a bot or spider using the following robots.txt:

User-agent: BaiduSpider
Disallow: /

or

User-agent: *
Disallow: /

This doesn't work for Baidu, which ignores these directives.

You can use the following IIS URL Rewrite rule to block the BaiduSpider User-Agent on your website. Only robots.txt remains accessible; all other requests are rejected with a 403 Forbidden response.

To block more bots, extend the pattern attribute with additional user agent strings separated by a pipe (|). For example, pattern="Baiduspider|Bing" or pattern="Googlebot|Bing".
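
As a minimal sketch, blocking a second bot changes only the User-Agent condition of the rule shown further down. The pattern is evaluated as a regular expression, so the pipe acts as alternation. The bot names here are just the examples mentioned above, not a recommended block list.

```xml
<!-- Matches any User-Agent containing "Baiduspider" or "Bing" (case-insensitive) -->
<add input="{HTTP_USER_AGENT}" pattern="Baiduspider|Bing" ignoreCase="true" />
```

Because the match is a substring match, a short token such as "Bing" also catches longer agent names that contain it, so keep the tokens specific enough to avoid blocking visitors you want.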


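<!--
  Block BaiduSpider: return 403 for every request whose User-Agent
  contains "Baiduspider", except requests for /robots.txt.
-->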
<rule name="block_BaiduSpider" stopProcessing="true">
  <match url="(.*)" />
  <conditions trackAllCaptures="true">
   <add input="{HTTP_USER_AGENT}" pattern="Baiduspider" negate="false" ignoreCase="true" />
   <add input="{URL}" pattern="^/robots\.txt" negate="true" ignoreCase="true" />
  </conditions>
  <action type="CustomResponse"
   statusCode="403"
   statusReason="Forbidden: Access is denied."
   statusDescription="Access is denied!" />
</rule>
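
The rule above is only a fragment. The following sketch shows where it would typically sit in a site's web.config, assuming the IIS URL Rewrite module is installed and the site is configured through a web.config file in its root. Rules are evaluated in order, so placing this one before other rewrite or redirect rules lets the bot be rejected first. Any other rules in your own configuration are not shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Keep this rule ahead of other rules so blocked requests stop here -->
        <rule name="block_BaiduSpider" stopProcessing="true">
          <match url="(.*)" />
          <conditions trackAllCaptures="true">
            <!-- Condition 1: the User-Agent contains "Baiduspider" -->
            <add input="{HTTP_USER_AGENT}" pattern="Baiduspider" negate="false" ignoreCase="true" />
            <!-- Condition 2: the request is NOT for /robots.txt -->
            <add input="{URL}" pattern="^/robots\.txt" negate="true" ignoreCase="true" />
          </conditions>
          <action type="CustomResponse"
           statusCode="403"
           statusReason="Forbidden: Access is denied."
           statusDescription="Access is denied!" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
```

Both conditions must match for the 403 to be returned, which is why robots.txt stays reachable for the bot.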

Verifying the rewrite rule to block Baidu

You can easily verify the rewrite rule with Fiddler's Composer option: compose an HTTP request that sends a Baiduspider User-Agent header and check that the response is a 403. The next two images show the request and the response.

[Image: Verifying BaiduSpider is blocked with Fiddler: request]
[Image: Verifying BaiduSpider is blocked with Fiddler: response]



Jan Reilink

My name is Jan. I am not a hacker, coder, developer, programmer or guru. I am merely a system administrator, doing my daily thing at Vevida in the Netherlands. With over 15 years of experience, my specialties include Windows Server, IIS, Linux (CentOS, Debian), security, PHP, websites & optimization.
