Block WordPress comment spammers manually

The fewer spammers hit your WordPress blog, the better your blog performs; that is one of my opinions. A second is that the fewer unnecessary plugins you use on your WordPress blog, the better. So, a little while ago I decided to remove plugins like Stop Spammer Registration Plugin and do their work myself. Here is why and how:

As long as Akismet catches the spam, I can block the IP addresses myself. Plus, I might be able to see some trends like IP ranges that spam a lot, new IP ranges, new spam templates being used, and so on. I like that :).

Blocking WordPress spammers manually may sound very time-consuming, but it really isn’t. You mostly have to wait for the spam. Here on Saotn.org, Akismet catches about 98.53% of the spam, so I don’t have to mark a lot of comments as spam myself. Further, we automate a lot of tasks with MySQL and Notepad++ (my text editor of choice on Windows), or Vim in your Linux Bash shell.

There is more than one way to block comment spammers on your WordPress blog. As you might know, I host my website on the IIS web server platform. Therefore we need the Dynamic IP Address Restrictions module.

If you are on Apache, you can use mod_rewrite instead.

Find spammer IP addresses in MySQL database

All real live data here. To find IP addresses belonging to spammers in your database:

WordPress saves the comments in the table prefix_comments. We can easily use MySQL to list all IP addresses that match our requirements. Akismet uses the word spam in the column comment_approved.

Knowing that, our query becomes:

SELECT DISTINCT comment_author_IP FROM `wp_comments` WHERE `comment_approved` = 'spam';

We use SELECT DISTINCT to list unique IP addresses. In my case, this lists the following IP addresses:

+---------------------------+
| comment_author_IP         |
+---------------------------+
| 37.115.188.27             |
| 5.135.200.88              |
| 173.213.97.191            |
| 175.44.17.43              |
| 64.120.171.172            |
| 173.213.99.149            |
| 190.200.17.249            |
[...]
| 213.184.99.7              |
| 107.6.159.30              |
| 178.216.54.220            |
| 198.143.145.182           |
| 173.232.105.133           |
| 151.237.177.159           |
| 2002:7180:2e5d::7180:2e5d |
| 96.127.185.149            |
| 5.157.45.154              |
| 60.168.7.171              |
| 50.2.194.250              |
| 60.168.6.2                |
[...]
| 218.85.145.245            |
| 5.135.200.91              |
| 74.221.223.195            |
| 59.60.123.156             |
| 198.50.189.122            |
+---------------------------+
452 rows in set (0.00 sec)

Did you notice the one IPv6 address? All IP addresses are presumed innocent until proven guilty :-) Look up their reputation at SenderBase.org.
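To spot trends, it can help to count the spam comments per IP address instead of listing each address only once. Here is a minimal Python sketch of that idea. It is not part of the workflow above: it uses an in-memory SQLite table as a stand-in for wp_comments (an assumption to keep the example self-contained; on a real blog you would run the same SQL against MySQL), and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite table standing in for WordPress' wp_comments.
# Assumption: only the two columns used by the query are modelled here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_comments (comment_author_IP TEXT, comment_approved TEXT)"
)
conn.executemany(
    "INSERT INTO wp_comments VALUES (?, ?)",
    [
        ("37.115.188.27", "spam"),
        ("37.115.188.27", "spam"),
        ("5.135.200.88", "spam"),
        ("203.0.113.7", "1"),  # approved comment, must not be listed
    ],
)

# Same filter as the query above, but grouped so repeat offenders stand out.
spam_counts = conn.execute(
    """
    SELECT comment_author_IP, COUNT(*) AS hits
    FROM wp_comments
    WHERE comment_approved = 'spam'
    GROUP BY comment_author_IP
    ORDER BY hits DESC, comment_author_IP
    """
).fetchall()

for ip, hits in spam_counts:
    print(f"{ip}\t{hits}")
```

Addresses with many hits are the first candidates for a block rule.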

Preparing the IP addresses for web.config with Notepad++

You might think it’s a lot of work to copy/paste each address into the web.config format:

<add ipAddress="aa.bbb.cc.ddd" allowed="false" />

but it isn’t. Notepad++ has a handy search and replace function (shortcut key: CTRL H).

  1. first, find one space (“ ”) and replace ‘all’ with nothing. All spaces are gone;
  2. second, find the pipe (“|”) and replace ‘all’ with nothing. All pipes are gone too;
  3. third, use the Regular Expression option to execute the following search and replace:
    Find what: ^(.*)$
    Replace with: <add ipAddress="\1" allowed="false" />
    The \1 is a back reference to the IP address captured by (.*), and we have our web.config format ready! Delete the table border, header and row-count lines beforehand, or they end up wrapped in <add ... /> as well.
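If you prefer a script over search and replace, the three steps above can be sketched in Python. This is only an illustration under a few assumptions: the function name table_to_ipsecurity is made up, the input is the table exactly as the mysql client prints it, and anything that does not parse as an IP address (borders, the header, the row count) is silently skipped.

```python
import ipaddress


def table_to_ipsecurity(mysql_output: str) -> list[str]:
    """Turn the mysql client's table output into <add ipAddress=... /> lines."""
    entries = []
    for line in mysql_output.splitlines():
        candidate = line.replace("|", "").strip()  # drop pipes and padding
        try:
            ipaddress.ip_address(candidate)  # rejects borders, header, "[...]"
        except ValueError:
            continue
        entries.append(f'<add ipAddress="{candidate}" allowed="false" />')
    return entries


sample = """\
+---------------------------+
| comment_author_IP         |
+---------------------------+
| 37.115.188.27             |
| 2002:7180:2e5d::7180:2e5d |
+---------------------------+
452 rows in set (0.00 sec)
"""

web_config_lines = table_to_ipsecurity(sample)
print("\n".join(web_config_lines))
```

Validating with the standard ipaddress module means a stray line can never end up as a broken rule in web.config.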

Add IP addresses to Dynamic IP Address Restrictions web.config

The partial output of our Notepad++ actions above is:

<add ipAddress="107.6.159.30" allowed="false" />
<add ipAddress="108.163.221.85" allowed="false" />
<add ipAddress="108.163.247.19" allowed="false" />
<add ipAddress="108.163.248.68" allowed="false" />
<add ipAddress="108.163.248.70" allowed="false" />
<add ipAddress="108.177.194.114" allowed="false" />
<add ipAddress="108.178.5.100" allowed="false" />
...
...

We add this to our web.config file under <security> <ipSecurity>, just copy and paste. Now all those IP addresses are blocked and denied access to my blog.
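Copy and paste works fine, but the same step can also be scripted. The following Python sketch is an assumption-laden illustration, not my actual tooling: the helper name block_addresses and the tiny WEB_CONFIG sample are hypothetical, and a real web.config will contain much more. It appends one add element per address under the existing ipSecurity element and skips addresses that are already listed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, stripped-down web.config used only for this example.
WEB_CONFIG = """<configuration>
  <system.webServer>
    <security>
      <ipSecurity />
    </security>
  </system.webServer>
</configuration>"""


def block_addresses(config_xml: str, addresses: list[str]) -> str:
    """Return config_xml with a deny entry added for every new address."""
    root = ET.fromstring(config_xml)
    ip_security = next(root.iter("ipSecurity"), None)
    if ip_security is None:
        raise ValueError("no <ipSecurity> element found")
    already = {el.get("ipAddress") for el in ip_security.findall("add")}
    for address in addresses:
        if address not in already:  # avoid duplicate entries
            ET.SubElement(ip_security, "add", ipAddress=address, allowed="false")
            already.add(address)
    return ET.tostring(root, encoding="unicode")


updated = block_addresses(
    WEB_CONFIG, ["107.6.159.30", "108.163.221.85", "107.6.159.30"]
)
print(updated)
```

Note that ElementTree drops XML comments when parsing by default, so keep a backup of the original file if you try something like this.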

Block IP addresses in Apache 2.4.6+ .htaccess module mod_authz_core

Apache 2.4.6+ uses a new module, mod_authz_core, for authorization and blocking. The Apache mod_authz_core documentation says:

This module provides core authorization capabilities so that authenticated users can be allowed or denied access to portions of the web site. mod_authz_core provides the functionality to register various authorization providers. It is usually used in conjunction with an authentication provider module such as mod_authn_file and an authorization module such as mod_authz_user. It also allows for advanced logic to be applied to the authorization processing.

Apache Module mod_authz_core

Because it’s really different from the Apache 2.2 access control, it requires a change of syntax (and of mindset). WordPress plugin authors take note: the new syntax is:

<RequireAll>
  Require all granted
  # IP address to block
  Require not ip 198.51.100.25
  # ...
</RequireAll>

You’ll find more information in my post WordPress .htaccess security best practices in Apache 2.4.6+.

To create an Apache .htaccess blacklist use the following steps:

  • connect to your database – use SSL wherever available for MySQL connections: 
    • mysql --ssl-mode=PREFERRED -h db_hostname -u user_name -p --database=db_name, or
    • mysql --ssl-mode=REQUIRED -h db_hostname -u user_name -p --database=db_name
  • select the IP addresses of all comments marked as spam:
    • SELECT DISTINCT comment_author_IP FROM wp_comments WHERE comment_approved = 'spam';
  • save the result in a file, and use Vim to transform it into a compatible .htaccess format:
    • :%s/ //g # remove spaces
    • :%s/|//g # remove pipes
    • :%s/^\(.*\)$/Require not ip \1/g # Apache 2.4.6+ .htaccess module mod_authz_core
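The Vim substitutions above can also be sketched as a small Python function. The function name require_all_block is made up for this example. Note that it wraps the lines in a RequireAll container: Apache does not permit negated Require directives such as Require not ip inside RequireAny, so RequireAll with Require all granted is the form that works.

```python
import ipaddress


def require_all_block(addresses: list[str]) -> str:
    """Build a mod_authz_core block that denies the given IP addresses."""
    lines = ["<RequireAll>", "    Require all granted"]
    for address in dict.fromkeys(addresses):  # de-duplicate, keep order
        ipaddress.ip_address(address)  # raises ValueError if not an IP address
        lines.append(f"    Require not ip {address}")
    lines.append("</RequireAll>")
    return "\n".join(lines)


htaccess_block = require_all_block(
    ["198.51.100.25", "198.51.100.25", "2001:db8::1"]
)
print(htaccess_block)
```

The addresses used here are documentation examples, not real spammers.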

Blocking IP addresses with Apache mod_rewrite

If your blog is on an Apache web server, you can use mod_rewrite to block the IP addresses. The syntax is:

<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteCond %{REMOTE_ADDR} ^107\.6\.159\.30 [OR]
  RewriteCond %{REMOTE_ADDR} ^108\.163\.221\.85 [OR]
  RewriteCond %{REMOTE_ADDR} ^108\.163\.247\.19 [OR]
  ...
  ...
  RewriteRule ^(.*)$ - [F,L]
</IfModule>
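Writing these conditions by hand gets tedious, so here is a hedged Python sketch that generates them. The function name rewrite_block is hypothetical. Two details differ slightly from the hand-written rules: every pattern gets a trailing $ so it cannot match a longer address by accident, and the last condition is written without [OR], which mod_rewrite needs before the RewriteRule.

```python
import re


def rewrite_block(addresses: list[str]) -> str:
    """Build a mod_rewrite block that answers 403 for the given addresses."""
    unique = list(dict.fromkeys(addresses))  # de-duplicate, keep order
    if not unique:
        # Without conditions the RewriteRule would block every visitor.
        raise ValueError("refusing to write a rule without conditions")
    lines = ["<IfModule mod_rewrite.c>", "    RewriteEngine On"]
    for index, address in enumerate(unique):
        suffix = " [OR]" if index < len(unique) - 1 else ""
        # re.escape turns "." into "\." so a dot only matches a literal dot.
        lines.append(
            f"    RewriteCond %{{REMOTE_ADDR}} ^{re.escape(address)}${suffix}"
        )
    lines.append("    RewriteRule ^(.*)$ - [F,L]")
    lines.append("</IfModule>")
    return "\n".join(lines)


rules = rewrite_block(["107.6.159.30", "108.163.221.85"])
print(rules)
```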

When, in time, your web.config or .htaccess file grows (and becomes huge), it is necessary to start using rewrite maps.

And of course you can start blocking whole IP ranges. Don’t forget to delete the spam comments and their metadata from your database afterwards.

Note: it’s better to use a mod_authz_core RequireAll block like the one above than all these rewrite conditions.
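To decide which ranges are worth blocking as a whole, you can group the spam addresses by network and count them. The Python sketch below is one possible way to do that, under stated assumptions: the function name busiest_ranges is made up, the /24 default prefix is an arbitrary choice, and IPv6 addresses are simply skipped.

```python
import ipaddress
from collections import Counter


def busiest_ranges(addresses: list[str], prefix: int = 24) -> list[tuple[str, int]]:
    """Count spam addresses per IPv4 network of the given prefix length."""
    counts: Counter[str] = Counter()
    for address in addresses:
        ip = ipaddress.ip_address(address)
        if ip.version != 4:
            continue  # this sketch only groups IPv4 addresses
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        counts[str(network)] += 1
    return counts.most_common()


spam_ips = [
    "173.213.97.191",
    "173.213.99.149",
    "60.168.7.171",
    "60.168.6.2",
    "5.135.200.88",
    "5.135.200.91",
]

ranges_24 = busiest_ranges(spam_ips)
ranges_16 = busiest_ranges(spam_ips, prefix=16)
for network, hits in ranges_24:
    print(f"{network}\t{hits}")
```

Apache accepts such a network in CIDR notation directly, for example Require not ip 5.135.200.0/24. Be careful with wide ranges: they may also block legitimate visitors.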

Create your own local web blacklist of comment spammers in PHP

Use Project Honey Pot and Stop Forum Spam to block spammers on your website

Besides the steps above, it is also possible to automatically filter web traffic with one or more blacklists, Project Honey Pot for instance. The IP address of every visitor is looked up in the Project Honey Pot database. If it’s listed there, access to the website is denied. I’ve written a PHP implementation that works with both IIS web.config and Apache mod_rewrite.
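To give an idea of what such a lookup involves, here is a small Python sketch. It is not the PHP implementation mentioned above. It assumes the Project Honey Pot http:BL DNS format as I understand it (access key, then the reversed IPv4 octets, then dnsbl.httpbl.org, answered with 127.days.threat.type), so verify it against the current http:BL API documentation before relying on it. The access key abcdefghijkl is a placeholder, and no DNS query is actually sent.

```python
import ipaddress


def httpbl_query_name(access_key: str, ip: str) -> str:
    """Build the DNS name to look up for an IPv4 visitor address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return f"{access_key}.{'.'.join(reversed(octets))}.dnsbl.httpbl.org"


def parse_httpbl_answer(answer: str) -> dict:
    """Decode an answer of the assumed form 127.<days>.<threat>.<type>."""
    first, days, threat, visitor_type = (int(part) for part in answer.split("."))
    if first != 127:
        raise ValueError("not a valid http:BL answer")
    return {
        "days_since_last_activity": days,
        # For search engines (type 0) this octet is assumed to carry an
        # identifier rather than a threat score.
        "threat_score": threat,
        "search_engine": visitor_type == 0,
        "suspicious": bool(visitor_type & 1),
        "harvester": bool(visitor_type & 2),
        "comment_spammer": bool(visitor_type & 4),
    }


# A real lookup would resolve this name, e.g. with socket.gethostbyname();
# an unlisted address gives a "not found" DNS error instead of an answer.
query = httpbl_query_name("abcdefghijkl", "127.9.1.2")
verdict = parse_httpbl_answer("127.3.5.4")
print(query)
print(verdict)
```

A site could then deny access when comment_spammer is set and the threat score is above a threshold of its choosing.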

2 thoughts on “Block WordPress comment spammers manually”

  1. Jim Walker from HackRepair.com posted a 2016 version of his Bad Bots .htaccess on Pastebin. I offered to translate Jim’s Bad Bots .htaccess to web.config, for use with Windows Server IIS. And here it is: learn to protect your WordPress website with this web.config file!

    Bad Bots web.config for IIS

    Just put the following content in a new text file, save as web.config and upload it to your website. Note, your hosting provider may have denied access to some settings, but basically all you need is IIS URL Rewrite.

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
    <system.webServer>
    <httpProtocol>
    <customHeaders>
    <!--
    # Security related. May help against some types of drive-by-downloads
    # Read more: https://www.owasp.org/index.php/List_of_useful_HTTP_headers
    -->
    <add name="X-Content-Type-Options" value="nosniff" />
    <!--
    # Security related. May help against some types of cross-site scripting attacks
    # Read more: https://www.owasp.org/index.php/List_of_useful_HTTP_headers
    -->
    <add name="X-XSS-Protection" value="1; mode=block" />
    </customHeaders>
    </httpProtocol>

    <rewrite>
    <rules>
    <rule name="Block Common Malicious Bot Queries" stopProcessing="true">
    <match url=".*" ignoreCase="false" />
    <conditions logicalGrouping="MatchAny">
    <add input="{QUERY_STRING}" pattern="http://www.google.com/humans.txt?" />
    <add input="{QUERY_STRING}" pattern="(img|thumb|thumb_editor|thumbopen).php" />
    <add input="{QUERY_STRING}" pattern="fckeditor" />
    <add input="{QUERY_STRING}" pattern="revslider" />
    </conditions>
    <action type="CustomResponse" statusCode="403" statusReason="Forbidden" statusDescription="Forbidden" />
    </rule>

    <rule name="Abuse User Agents Blocking" stopProcessing="true">
    <!--
    # Blocking user agents stops traffic from the named bots below
    # it matches any bot named below
    -->
    <match url=".*" ignoreCase="false" />
    <conditions logicalGrouping="MatchAny">
    <add input="{HTTP_USER_AGENT}" pattern="^.*(1Noonbot|1on1searchBot|3D_SEARCH|3DE_SEARCH2|3GSE|50.nu|192.comAgent|360Spider|A6-Indexer|AASP|ABACHOBot|Abonti|abot|AbotEmailSearch|Aboundex|AboutUsBot|AccMonitor Compliance|accoona|AChulkov.NET page walker|Acme.Spider|AcoonBot|acquia-crawler|ActiveTouristBot|Acunetix|Ad Muncher|AdamM|adbeat_bot|adminshop.com|Advanced Email|AESOP_com_SpiderMan|AESpider|AF Knowledge Now Verity|aggregator:Vocus|ah-ha.com|AhrefsBot|AIBOT|aiHitBot|aipbot|AISIID|AITCSRobot|Akamai-SiteSnapshot|AlexaWebSearchPlatform|AlexfDownload|Alexibot|AlkalineBOT|All Acronyms|Amfibibot|AmPmPPC.com|AMZNKAssocBot|Anemone|Anonymous|Anonymouse.org|AnotherBot|AnswerBot|AnswerBus|AnswerChase PROve|AntBot|antibot-|AntiSantyWorm|Antro.Net|AONDE-Spider|Aport|Aqua_Products|AraBot|Arachmo|Arachnophilia|archive.org_bot|aria eQualizer|arianna.libero.it|Arikus_Spider|Art-Online.com|ArtavisBot|Artera|ASpider|ASPSeek|asterias|AstroFind|athenusbot|AtlocalBot|Atomic_Email_Hunter|attach|attrakt|attributor|Attributor.comBot|augurfind|AURESYS|AutoBaron|autoemailspider|autowebdir|AVSearch-|axfeedsbot|Axonize-bot|Ayna|b2w|BackDoorBot|BackRub|BackStreet Browser|BackWeb|Baiduspider-video|Bandit|BatchFTP|baypup|BDFetch|BecomeBot|BecomeJPBot|BeetleBot|Bender|besserscheitern-crawl|betaBot|Big Brother|Big Data|Bigado.com|BigCliqueBot|Bigfoot|BIGLOTRON|Bilbo|BilgiBetaBot|BilgiBot|binlar|bintellibot|bitlybot|BitvoUserAgent|Bizbot003|BizBot04|BizBot04 kirk.overleaf.com|Black.Hole|Black Hole|Blackbird|BlackWidow|bladder fusion|Blaiz-Bee|BLEXBot|Blinkx|BlitzBOT|Blog Conversation Project|BlogMyWay|BlogPulseLive|BlogRefsBot|BlogScope|Blogslive|BloobyBot|BlowFish|BLT|bnf.fr_bot|BoaConstrictor|BoardReader-Image-Fetcher|BOI_crawl_00|BOIA-Scan-Agent|BOIA.ORG-Scan-Agent|boitho.com-dc|Bookmark Buddy|bosug|Bot 
Apoena|BotALot|BotRightHere|Botswana|bottybot|BpBot|BRAINTIME_SEARCH|BrokenLinkCheck.com|BrowserEmulator|BrowserMob|BruinBot|BSearchR&D|BSpider|btbot|Btsearch|Buddy|Buibui|BuildCMS|BuiltBotTough|Bullseye|bumblebee|BunnySlippers|BuscadorClarin|Butterfly|BuyHawaiiBot|BuzzBot|byindia|BySpider|byteserver|bzBot|c r a w l 3 r|CacheBlaster|CACTVS Chemistry|Caddbot|Cafi|Camcrawler|CamelStampede|Canon-WebRecord|Canon-WebRecordPro|CareerBot|casper|cataguru|CatchBot|CazoodleBot|CCBot|CCGCrawl|ccubee|CD-Preload|CE-Preload|Cegbfeieh|Cerberian Drtrs|CERT FigleafBot|cfetch|CFNetwork|Chameleon|ChangeDetection|Charlotte|Check&Get|Checkbot|Checklinks|checkprivacy|CheeseBot|ChemieDE-NodeBot|CherryPicker|CherryPickerElite|CherryPickerSE|Chilkat|ChinaClaw|CipinetBot|cis455crawler|citeseerxbot|cizilla.com|ClariaBot|clshttp|Clushbot|cmsworldmap|coccoc|CollapsarWEB|Collector|combine|comodo|conceptbot|ConnectSearch|conpilot|ContentSmartz|ContextAd|contype|cookieNET|CoolBott|CoolCheck|Copernic|Copier|CopyRightCheck|core-project|cosmos|Covario-IDS|Cowbot-|Cowdog|crabbyBot|crawl|Crawl_Application|crawl.UserAgent|CrawlConvera|crawler|crawler_for_infomine|CRAWLER-ALTSE.VUNET.ORG-Lynx|crawler-upgrade-config|crawler.kpricorn.org|crawler@|crawler4j|crawler43.ejupiter.com|Crawly|CreativeCommons|Crescent|Crescent Internet ToolPak HTTP OLE Control|cs-crawler|CSE HTML Validator|CSHttpClient|Cuasarbot|culsearch|Curl|Custo|Cutbot|cvaulev|Cyberdog|CyberNavi_WebGet|CyberSpyder|CydralSpider).*quot; />
    <add input="{HTTP_USER_AGENT}" pattern="^.*(D1GArabicEngine|DA|DataCha0s|DataFountains|DataparkSearch|DataSpearSpiderBot|DataSpider|Dattatec.com|Dattatec.com-Sitios-Top|Daumoa|DAUMOA-video|DAUMOA-web|Declumbot|Deepindex|deepnet|DeepTrawl|dejan|del.icio.us-thumbnails|DelvuBot|Deweb|DiaGem|Diamond|DiamondBot|diavol|DiBot|didaxusbot|DigExt|Digger|DiGi-RSSBot|DigitalArchivesBot|DigOut4U|DIIbot|Dillo|Dir_Snatch.exe|DISCo|DISCo Pump|discobot|DISCoFinder|Distilled-Reputation-Monitor|Dit|DittoSpyder|DjangoTraineeBot|DKIMRepBot|DoCoMo|DOF-Verify|domaincrawler|DomainScan|DomainWatcher|dotbot|DotSpotsBot|Dow Jonesbot|Download|Download Demon|Downloader|DOY|dragonfly|Drip|drone|DTAAgent|dtSearchSpider|dumbot|Dwaar|Dwaarbot|DXSeeker|EAH|EasouSpider|EasyDL|ebingbong|EC2LinkFinder|eCairn-Grabber|eCatch|eChooseBot|ecxi|EdisterBot|EduGovSearch|egothor|eidetica.com|EirGrabber|ElisaBot|EllerdaleBot|EMail Exractor|EmailCollector|EmailLeach|EmailSiphon|EmailWolf|EMPAS_ROBOT|EnaBot|endeca|EnigmaBot|Enswer Neuro|EntityCubeBot|EroCrawler|eStyleSearch|eSyndiCat|Eurosoft-Bot|Evaal|Eventware|Everest-Vulcan|Exabot|Exabot-Images|Exabot-Test|Exabot-XXX|ExaBotTest|ExactSearch|exactseek.com|exooba|Exploder|explorersearch|extract|Extractor|ExtractorPro|EyeNetIE|ez-robot|Ezooms|factbot|FairAd Client|falcon|Falconsbot|fast-search-engine|FAST Data Document|FAST ESP|fastbot|fastbot.de|FatBot|Favcollector|Faviconizer|FDM|FedContractorBot|feedfinder|FelixIDE|fembot|fetch_ici|Fetch API Request|fgcrawler|FHscan|fido|Filangy|FileHound|FindAnISP.com_ISP_Finder|findlinks|FindWeb|Firebat|Fish-Search-Robot|Flaming 
AttackBot|Flamingo_SearchEngine|FlashCapture|FlashGet|flicky|FlickySearchBot|flunky|focused_crawler|FollowSite|Foobot|Fooooo_Web_Video_Crawl|Fopper|FormulaFinderBot|Forschungsportal|fr_crawler|Francis|Freecrawl|FreshDownload|freshlinks.exe|FriendFeedBot|frodo.at|froGgle|FrontPage|Froola|FU-NBI|full_breadth_crawler|FunnelBack|FunWebProducts|FurlBot|g00g1e|G10-Bot|Gaisbot|GalaxyBot|gazz|gcreep|generate_infomine_category_classifiers|genevabot|genieBot|GenieBotRD_SmallCrawl|Genieo|Geomaxenginebot|geometabot|GeonaBot|GeoVisu|GermCrawler|GetHTMLContents|Getleft|GetRight|GetSmart|GetURL.rexx|GetWeb!|Giant|GigablastOpenSource|Gigabot|Girafabot|GleameBot|gnome-vfs|Go-Ahead-Got-It|Go!Zilla|GoForIt.com|GOFORITBOT|gold|Golem|GoodJelly|Gordon-College-Google-Mini|goroam|GoSeebot|gotit|Govbot|GPU p2p|grab|Grabber|GrabNet|Grafula|grapeFX|grapeshot|GrapeshotCrawler|grbot|GreenYogi [ZSEBOT]|Gromit|GroupMe|grub|grub-client|Grubclient-|GrubNG|GruBot|gsa|GSLFbot|GT::WWW|Gulliver|GulperBot|GurujiBot|GVC|GVC BUSINESS|gvcbot.com|HappyFunBot|harvest|HarvestMan|Hatena Antenna|Hawler|Hazel's Ferret hopper|hcat|hclsreport-crawler|HD nutch agent|Header_Test_Client|healia|Helix|heritrix|hijbul-heritrix-crawler|HiScan|HiSoftware AccMonitor|HiSoftware AccVerify|hitcrawler_|hivaBot|hloader|HMSEbot|HMView|hoge|holmes|HomePageSearch|Hooblybot-Image|HooWWWer|Hostcrawler|HSFT - Link|HSFT - LVU|HSlide|ht:|htdig|Html Link Validator|HTMLParser|HTTP::Lite|httplib|HTTrack|Huaweisymantecspider|hul-wax|humanlinks|HyperEstraier|Hyperix).*quot; />
    <add input="{HTTP_USER_AGENT}" pattern="^.*(ia_archiver|IAArchiver-|ibuena|iCab|ICDS-Ingestion|ichiro|iCopyright Conductor|id-search|IDBot|IEAutoDiscovery|IECheck|iHWebChecker|IIITBOT|iim_405|IlseBot|IlTrovatore|Iltrovatore-Setaccio|ImageBot|imagefortress|ImagesHereImagesThereImagesEverywhere|ImageVisu|imds_monitor|imo-google-robot-intelink|IncyWincy|Industry Cortexcrawler|Indy Library|indylabs_marius|InelaBot|Inet32 Ctrl|inetbot|InfoLink|INFOMINE|infomine.ucr.edu|InfoNaviRobot|Informant|Infoseek|InfoTekies|InfoUSABot|INGRID|Inktomi|InsightsCollector|InsightsWorksBot|InspireBot|InsumaScout|Intelix|InterGET|Internet Ninja|InternetLinkAgent|Interseek|IOI|ip-web-crawler.com|IPAdd|Ipselonbot|Iria|IRLbot|Iron33|Isara|iSearch|iSiloX|IsraeliSearch|IstellaBot|its-learning|IU_CSCI_B659_class_crawler|iVia|iVia Page Fetcher|JadynAve|JadynAveBot|jakarta|Jakarta Commons-HttpClient|Java|Jbot|JemmaTheTourist|JennyBot|Jetbot|JetBrains Omea Pro|JetCar|Jim|JoBo|JobSpider_BA|JOC|JoeDog|JoyScapeBot|JSpyda|JubiiRobot|jumpstation|Junut|JustView|Jyxobot|K.S.Bot|KakcleBot|kalooga|KaloogaBot|kanagawa|KATATUDO-Spider|Katipo|kbeta1|Kenjin.Spider|KeywenBot|Keyword.Density|Keyword Density|kinjabot|KIT-Fireball|Kitenga-crawler-bot|KiwiStatus|kmbot-|kmccrew|Knight|KnowItAll|Knowledge.com|Knowledge Engine|KoepaBot|Koninklijke|KrOWLer|KSbot|kuloko-bot|kulturarw3|KummHttp|Kurzor|Kyluka|L.webis|LabelGrab|Labhoo|labourunions411|lachesis|Lament|LamerExterminator|LapozzBot|larbin|LARBIN-EXPERIMENTAL|LBot|LeapTag|LeechFTP|LeechGet|LetsCrawl.com|LexiBot|LexxeBot|lftp|libcrawl|libiViaCore|libWeb|libwww|libwww-perl|likse|Linguee|Link|link_checker|LinkAlarm|linkbot|LinkCheck by Siteimprove.com|LinkChecker|linkdex.com|LinkextractorPro|LinkLint|linklooker|Linkman|LinkScan|LinksCrawler|LinksManager.com_bot|LinkSweeper|linkwalker|LiteFinder|LitlrBot|Little Grabber at Skanktale.com|Livelapbot|LM 
Harvester|LMQueueBot|LNSpiderguy|LoadTimeBot|LocalcomBot|locust|LolongBot|LookBot|Lsearch|lssbot|LWP|lwp-request|lwp-trivial|LWP::Simple|Lycos_Spider|Lydia Entity|LynnBot|Lytranslate|Mag-Net|Magnet|magpie-crawler|Magus|Mail.Ru|Mail.Ru_Bot|MAINSEEK_BOT|Mammoth|MarkWatch|MaSagool|masidani_bot_|Mass|Mata.Hari|Mata Hari|matentzn at cs dot man dot ac dot uk|maxamine.com--robot|maxamine.com-robot|maxomobot|Maxthon$|McBot|MediaFox|medrabbit|Megite|MemacBot|Memo|MendeleyBot|Mercator-|mercuryboard_user_agent_sql_injection.nasl|MerzScope|metacarta|Metager2|metager2-verification-bot|MetaGloss|METAGOPHER|metal|metaquerier.cs.uiuc.edu|METASpider|Metaspinner|MetaURI|MetaURI API|MFC_Tear_Sample|MFcrawler|MFHttpScan|Microsoft.URL|MIIxpc|miner|mini-robot|minibot|miniRank|Mirror|Missigua Locator|Mister.PiX|Mister PiX|Miva|MJ12bot|mnoGoSearch|mod_accessibility|moduna.com|moget|MojeekBot|MOMspider|MonkeyCrawl|MOSES|Motor|mowserbot|MQbot|MSE360|MSFrontPage|MSIECrawler|MSIndianWebcrawl|MSMOBOT|Msnbot|msnbot-products|MSNPTC|MSRBOT|MT-Soft|MultiText|My_Little_SearchEngine_Project|my-heritrix-crawler|MyApp|MYCOMPANYBOT|mycrawler|MyEngines-US-Bot|MyFamilyBot|Myra|nabot|nabot_|Najdi.si|Nambu|NAMEPROTECT|NatchCVS|naver|naverbookmarkcrawler|NaverBot|Navroad|NearSite|NEC-MeshExplorer|NeoScioCrawler|NerdByNature.Bot|NerdyBot|Nerima-crawl-).*quot; />
    <add input="{HTTP_USER_AGENT}" pattern="^.*(Nessus|NESSUS::SOAP|nestReader|Net::Trackback|NetAnts|NetCarta CyberPilot Pro|Netcraft|NetID.com|NetMechanic|Netprospector|NetResearchServer|NetScoop|NetSeer|NetShift=|NetSongBot|Netsparker|NetSpider|NetSrcherP|NetZip|NetZip-Downloader|NewMedhunt|news|News_Search_App|NewsGatherer|Newsgroupreporter|NewsTroveBot|NextGenSearchBot|nextthing.org|NG|NHSEWalker|nicebot|NICErsPRO|niki-bot|NimbleCrawler|nimbus-1|ninetowns|Ninja|NjuiceBot|NLese|Nogate|Nomad-V2.x|NoteworthyBot|NPbot|NPBot-|NRCan intranet|NSDL_Search_Bot|nu_tch-princeton|nuggetize.com|nutch|nutch1|NutchCVS|NutchOrg|NWSpider|Nymesis|nys-crawler|ObjectsSearch|oBot|Obvius external linkcheck|Occam|Ocelli|Octopus|ODP entries|Offline.Explorer|Offline Explorer|Offline Navigator|OGspider|OmiExplorer_Bot|OmniExplorer_Bot|omnifind|OmniWeb|OnetSzukaj|online link validator|OOZBOT|Openbot|Openfind|Openfind data|OpenHoseBot|OpenIntelligenceData|OpenISearch|OpenSearchServer_Bot|OpiDig|optidiscover|OrangeBot|ORISBot|ornl_crawler_1|ORNL_Mercury|osis-project.jp|OsO|OutfoxBot|OutfoxMelonBot|OWLER-BOT|owsBot|ozelot|P3P Client|page_verifier|PageBitesHyperBot|Pagebull|PageDown|PageFetcher|PageGrabber|PagePeeker|PageRank Monitor|pamsnbot.htm|Panopy|panscient.com|Pansophica|Papa Foto|PaperLiBot|parasite|parsijoo|Pathtraq|Pattern|Patwebbot|pavuk|PaxleFramework|PBBOT|pcBrowser|pd-crawler|PECL::HTTP|penthesila|PeoplePal|perform_crawl|PerMan|PGP-KA|PHPCrawl|PhpDig|PicoSearch|pipBot|pipeLiner|Pita|pixfinder|PiyushBot|planetwork|PleaseCrawl|Plucker|Plukkie|Plumtree|Pockey|Pockey-GetHTML|PoCoHTTP|pogodak.ba|Pogodak.co.yu|Poirot|polybot|Pompos|Poodle predictor|PopScreenBot|PostPost|PrivacyFinder|ProjectWF-java-test-crawler|ProPowerBot|ProWebWalker|PROXY|psbot|psbot-page|PSS-Bot|psycheclone|pub-crawler|pucl|pulseBot (pulse|Pump|purebot|PWeBot|pycurl|Python-urllib|pythonic-crawler|PythonWikipediaBot|q1|QEAVis agent|QFKBot|qualidade|Qualidator.com|QuepasaCreep|QueryN.Metasearch|QueryN 
Metasearch|quest.durato|Quintura-Crw|QunarBot|Qweery_robot.txt_CheckBot|QweeryBot|r2iBot|R6_CommentReader|R6_FeedFetcher|R6_VoteReader|RaBot|Radian6|radian6_linkcheck|RAMPyBot|RankurBot|RcStartBot|RealDownload|Reaper|REBI-shoveler|Recorder|RedBot|RedCarpet|ReGet|RepoMonkey|RepoMonkey Bait|Riddler|RIIGHTBOT|RiseNetBot|RiverglassScanner|RMA|RoboPal|Robosourcer|robot|robotek|robots|Robozilla|rogerBot|Rome Client|Rondello|Rotondo|Roverbot|RPT-HTTPClient|rtgibot|RufusBot|Runnk online rss reader|s~stremor-crawler|S2Bot|SafariBookmarkChecker|SaladSpoon|Sapienti|SBIder|SBL-BOT|SCFCrawler|Scich|ScientificCommons.org|ScollSpider|ScooperBot|Scooter|ScoutJet|ScrapeBox|Scrapy|SCrawlTest|Scrubby|scSpider|Scumbot|SeaMonkey$|Search-Channel|Search-Engine-Studio|search.KumKie.com|search.msn.com|search.updated.com|search.usgs.gov|Search Publisher|Searcharoo.NET|SearchBlox|searchbot|searchengine|searchhippo.com|SearchIt-Bot|searchmarking|searchmarks|searchmee_v|SearchmetricsBot|searchmining|SearchnowBot_v1|searchpreview|SearchSpider.com|SearQuBot|Seekbot|Seeker.lookseek.com|SeeqBot|seeqpod-vertical-crawler|Selflinkchecker|Semager|semanticdiscovery|Semantifire1|semisearch|SemrushBot|Senrigan|SEOENGWorldBot|SeznamBot|ShablastBot|ShadowWebAnalyzer|Shareaza|Shelob|sherlock|ShopWiki|ShowLinks|ShowyouBot|siclab|silk|Siphon|SiteArchive|SiteCheck-sitecrawl|sitecheck.internetseer.com|SiteFinder|SiteGuardBot|SiteOrbiter|SiteSnagger|SiteSucker|SiteSweeper|SiteXpert|SkimBot|SkimWordsBot|SkreemRBot|skygrid|Skywalker|Sleipnir|slow-crawler|SlySearch|smart-crawler|SmartDownload|Smarte|smartwit.com|Snake|Snapbot|SnapPreviewBot|Snappy|snookit|Snooper|Snoopy|SocialSearcher|SocSciBot|SOFT411 Directory|sogou|sohu-search|sohu agent|Sokitomi|Solbot|sootle|Sosospider|Space Bison|Space Fung|SpaceBison|SpankBot|spanner|Spatineo Monitor 
Controller|special_archiver|SpeedySpider|Sphider|Sphider2|spider|Spider.TerraNautic.net|SpiderEngine|SpiderKU|SpiderMan|Spinn3r|Spinne|sportcrew-Bot|spyder3.microsys.com|sqlmap|Squid-Prefetch|SquidClamAV_Redirector|Sqworm|SrevBot|sslbot|SSM Agent|StackRambler|StarDownloader|statbot|statcrawler|statedept-crawler|Steeler|STEGMANN-Bot|stero|Stripper|Stumbler|suchclip|sucker|SumeetBot|SumitBot|SummizeBot|SummizeFeedReader|SuperBot|superbot.com|SuperHTTP|SuperLumin|SuperPagesBot|Supybot|SURF|Surfbot|SurfControl|SurveyBot|suzuran|SWEBot|swish-e|SygolBot|SynapticWalker|Syntryx ANT Scout Chassis Pheromone|SystemSearch-robot|Szukacz).*quot; />
    <add input="{HTTP_USER_AGENT}" pattern="^.*(T-H-U-N-D-E-R-S-T-O-N-E|Tailrank|tAkeOut|TAMU_CRAWLER|TapuzBot|Tarantula|targetblaster.com|TargetYourNews.com|TAUSDataBot|taxinomiabot|Tecomi|TeezirBot|Teleport|Teleport Pro|TeleportPro|Telesoft|Teradex Mapper|TERAGRAM_CRAWLER|TerrawizBot|testbot|testing of|TextBot|thatrobotsite.com|The.Intraformant|The Dyslexalizer|The Intraformant|TheNomad|Theophrastus|theusefulbot|TheUsefulbot_|ThumbBot|thumbshots-de-bot|tigerbot|TightTwatBot|TinEye|Titan|to-dress_ru_bot_|to-night-Bot|toCrawl|Topicalizer|topicblogs|Toplistbot|TopServer PHP|topyx-crawler|Touche|TourlentaScanner|TPSystem|TRAAZI|TranSGeniKBot|travel-search|TravelBot|TravelLazerBot|Treezy|TREX|TridentSpider|Trovator|True_Robot|tScholarsBot|TsWebBot|TulipChain|turingos|turnit|TurnitinBot|TutorGigBot|TweetedTimes|TweetmemeBot|TwengaBot|TwengaBot-Discover|Twiceler|Twikle|twinuffbot|Twisted PageGetter|Twitturls|Twitturly|TygoBot|TygoProwler|Typhoeus|U.S. Government Printing Office|uberbot|ucb-nutch|UCSD-Crawler|UdmSearch|UFAM-crawler-|Ultraseek|UnChaos|unchaos_crawler_|UnisterBot|UniversalSearch|UnwindFetchor|UofTDB_experiment|updated|URI::Fetch|url_gather|URL-Checker|URL Control|URLAppendBot|URLBlaze|urlchecker|urlck|UrlDispatcher|urllib|URLSpiderPro|URLy.Warning|USAF AFKN|usasearch|USS-Cosmix|USyd-NLP-Spider|Vacobot|Vacuum|VadixBot|Vagabondo|Validator|Valkyrie|vBSEO|VCI|VerbstarBot|VeriCiteCrawler|Verifactrola|Verity-URL-Gateway|vermut|versus|versus.integis.ch|viasarchivinginformation.html|vikspider|VIP|VIPr|virus-detector|VisBot|Vishal For CLIA|VisWeb|vlad|vlsearch|VMBot|VocusBot|VoidEYE|VoilaBot|Vortex|voyager|voyager-hc|voyager-partner-deep|VSE|vspider).*quot; />
    <add input="{HTTP_USER_AGENT}" pattern="^.*(W3C_Unicorn|W3C-WebCon|w3m|w3search|wacbot|wastrix|Water Conserve|Water Conserve Portal|WatzBot|wauuu engine|Wavefire|Waypath|Wazzup|Wazzup1.0.4800|wbdbot|web-agent|Web-Sniffer|Web.Image.Collector|Web CEO Online|Web Image Collector|Web Link Validator|Web Magnet|webalta|WebaltBot|WebAuto|webbandit|webbot|webbul-bot|WebCapture|webcheck|Webclipping.com|webcollage|WebCopier|WebCopy|WebCorp|webcrawl.net|webcrawler|WebDownloader for|Webdup|WebEMailExtrac|WebEMailExtrac.*|WebEnhancer|WebFerret|webfetch|WebFetcher|WebGather|WebGo IS|webGobbler|WebImages|Webinator-search2.fasthealth.com|Webinator-WBI|WebIndex|WebIndexer|weblayers|WebLeacher|WeblexBot|WebLinker|webLyzard|WebmasterCoffee|WebmasterWorld|WebmasterWorldForumBot|WebMiner|WebMoose|WeBot|WebPix|WebReaper|WebRipper|WebSauger|Webscan|websearchbench|WebSite|websitemirror|WebSpear|websphinx.test|WebSpider|Webster|Webster.Pro|Webster Pro|WebStripper|WebTrafficExpress|WebTrends Link Analyzer|webvac|webwalk|WebWalker|Webwasher|WebWatch|WebWhacker|WebXM|WebZip|Weddings.info|wenbin|WEPA|WeRelateBot|Whacker|Widow|WikiaBot|Wikio|wikiwix-bot-|WinHttp.WinHttpRequest|WinHTTP Example|WIRE|wired-digital-newsbot|WISEbot|WISENutbot|wish-la|wish-project|wisponbot|WMCAI-robot|wminer|WMSBot|woriobot|worldshop|WorQmada|Wotbox|WPScan|wume_crawler|WWW-Mechanize|www.freeloader.com.|WWW Collector|WWWOFFLE|wwwrobot|wwwster|WWWWanderer|wwwxref|Wysigot|X-clawler|Xaldon|Xenu|Xenu's|Xerka MetaBot|XGET|xirq|XmarksFetch|XoviBot|xqrobot|Y!J|Y!TunnelPro|yacy.net|yacybot|yarienavoir.net|Yasaklibot|yBot|YebolBot|yellowJacket|yes|YesupBot|Yeti|YioopBot|YisouSpider|yolinkBot|yoogliFetchAgent|yoono|Yoriwa|YottaCars_Bot|you-dir|Z-Add Link|zagrebin|Zao|zedzo.digest|zedzo.validate|zermelo|Zeus|Zeus Link Scout|zibber-v|zimeno|Zing-BottaBot|ZipppBot|zmeu|ZoomSpider|ZuiBot|ZumBot|Zyborg|Zyte).*quot; />
    </conditions>
    <action type="CustomResponse" statusCode="403" statusReason="Forbidden" statusDescription="Forbidden" />
    </rule>
    </rules>
    </rewrite>

    <security>
    <!-- To block IP addresses, see example format below. Add to list as needed: -->
    <ipSecurity>
    <add ipAddress="203.0.113.15" allowed="false" />
    </ipSecurity>

    <!-- Deny URL requests to files with .log extension, like debug.log -->
    <requestFiltering>
    <denyUrlSequences>
    <add sequence=".log" />
    </denyUrlSequences>

    <!--
    # Block query string sequences. The following example blocks a URL
    # containing a query string "?foobar=revslider". Only use these rules if
    # you're familiar with the exact query string used.
    #
    # This one was not in HackRepair.com's .htaccess
    -->
    <denyQueryStringSequences>
    <add sequence="foobar=revslider" />
    </denyQueryStringSequences>

    </requestFiltering>

    </security>
    </system.webServer>
    </configuration>

    My WordPress web.config

    Not so long ago, I posted my WordPress web.config, which I currently have in use. It offers a lot of the same functionality, so you’d best combine both web.config files into one.
