htaccess question

Dave G

Member
Good Day All
Is there an .htaccess file at the server/root level? I can't seem to find one, or any info on whether there is one and, if so, its location.
I am asking because I would like to block some search bots before they get to any of my customers' sites.

Thanks
 

Dave G

Member
OK, so I created a .htaccess file in my root/home directory and put the following code in it, but it doesn't seem to be working.
Did I do something wrong?
Thanks

Code:
SetEnvIfNoCase User-Agent ^$ bad_bot
SetEnvIfNoCase User-Agent "^SeznamBot" bad_bot
SetEnvIfNoCase User-Agent "^YandexBot" bad_bot
SetEnvIfNoCase User-Agent "^YandexImages" bad_bot
SetEnvIfNoCase User-Agent "^DotBot" bad_bot
SetEnvIfNoCase User-Agent "^AhrefsBot" bad_bot
SetEnvIfNoCase User-Agent "^Qwantify" bad_bot
# SetEnvIfNoCase User-Agent "^Enter User-Agent" bad_bot
# SetEnvIfNoCase User-Agent "^Enter User-Agent" bad_bot
# SetEnvIfNoCase User-Agent "^Enter User-Agent" bad_bot
# SetEnvIfNoCase User-Agent "^Enter User-Agent" bad_bot
# SetEnvIfNoCase User-Agent "^Enter User-Agent" bad_bot
# SetEnvIfNoCase User-Agent "^Enter User-Agent" bad_bot

<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
 

phpAddict

Active Member
Not sure why that wouldn't be working for you, but I've used this tool to create a variety of htaccess files in the past. It has a "Block bots" function that is formatted a little differently than what you have there. Thought maybe it will help you.
 

Dave G

Member
PHP
Thanks for the link, I've bookmarked it.
So my next question, and I hope I can explain it correctly:
I have a rule: RewriteCond %{HTTP_USER_AGENT} ^YandexBot [OR]
Yet this AM I found YandexBot/3.0 wandering around. I thought the "^" thing was supposed to say "Block any user-agent containing YandexBot"?
Should I/could I use something else, maybe a "*", so that as these bots change their names or add numbers they will still be blocked?

OOOOO so many questions:)

Thanks
 

phpAddict

Active Member
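It uses regular expressions. The caret anchors the pattern to the beginning of the string, so ^YandexBot only matches a user agent that starts with YandexBot. It would match YandexBot/3.0 on its own, but not a user agent that starts with something else, such as one beginning with Mozilla/5.0. The asterisk is not a wildcard by itself in a regular expression: it means "zero or more of the preceding character", so the wildcard form is .* (any run of characters). You shouldn't need it here, though, because a pattern without the caret matches anywhere in the string and keeps matching when the bot adds or changes a version number. If the user agent really does start with YandexBot and it is still getting through, something else is not right.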
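
To illustrate the anchoring, here is a minimal sketch using one of the user agent names from your list. The [NC] (case-insensitive) flag is my addition and not something from your file, and the three conditions are only there to compare pattern styles side by side; in practice you would keep just one of them.

Code:
RewriteEngine On
# Anchored: matches only if the header BEGINS with "YandexBot"
RewriteCond %{HTTP_USER_AGENT} ^YandexBot [NC,OR]
# Unanchored: matches "YandexBot" anywhere in the header, including after a "Mozilla/5.0 (compatible; " prefix
RewriteCond %{HTTP_USER_AGENT} YandexBot [NC,OR]
# Explicit wildcard form; matches the same headers as the unanchored pattern
RewriteCond %{HTTP_USER_AGENT} ^.*YandexBot [NC]
# Answer any matching request with 403 Forbidden
RewriteRule ^ - [F]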

If you've changed the code, please post the new .htaccess code you're using. Also, as a test, have you tried placing the .htaccess file in the account's public_html directory rather than in the home directory? Maybe it isn't being applied where it is now, whether because of file permissions or something in Apache's config.
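
On the Apache config point: .htaccess files are only read in directories where the main server config permits overrides. A rough sketch of what that could look like in httpd.conf follows. The /home/*/public_html path is an assumption about a typical shared-hosting layout, not something I know about your server, so check your actual config.

Code:
# Hypothetical path; adjust to your real document roots
<Directory "/home/*/public_html">
    # FileInfo covers the mod_rewrite and SetEnvIf directives; Limit covers Order/Allow/Deny
    AllowOverride FileInfo Limit
</Directory>

If overrides are only enabled from public_html down, as in this sketch, a .htaccess file one level up in the home directory would be ignored.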
 

Dave G

Member
Thanks for the explanation. Here is my current code.
I copied it to my customer's .htaccess this AM, so now it's a wait and see.

These are the 2 bots that came back and shouldn't have:
1/26/17, 5:47 AM Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)
1/26/17, 5:43 AM Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)

Code I created using the site you gave me:
Code:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^YandexBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^YandexImages [OR]
RewriteCond %{HTTP_USER_AGENT} ^DotBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^AhrefsBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Qwantify [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^YandexBot/3.0 [OR]
RewriteCond %{HTTP_USER_AGENT} ^YisouSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^BLEXBot/1.0 [OR]
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule ^.* - [F,L]

Thanks
 