Validator user-agent?

Busy's picture

He has: 6,151 posts

Joined: May 2001

Would anyone know the user agent for either the W3C or WDG validator sites? (They're probably the same.)
My .htaccess file blocks so many bad bots that I seem to have included the validators somehow.

Cheers

JeevesBond's picture

He has: 3,956 posts

Joined: Jun 2002

Is this one in your .htaccess file?

Failing that, I'll try some log action tonight.

Busy's picture

He has: 6,151 posts

Joined: May 2001

Thanks, but nope, that's not included.

There are over 100 blocked, so removing them one by one would take forever.
All of those blocked have been bad at least 3 times: sucking bandwidth, harvesting email addresses or leeching stuff.

Renegade's picture

He has: 3,022 posts

Joined: Oct 2002

W3C_Validator/1.432.2.5 - HTML
and
Jigsaw/2.2.5 W3C_CSS_Validator_JFouffa/2.0 - CSS

According to my logs.

JeevesBond's picture

He has: 3,956 posts

Joined: Jun 2002

Makes sense, but why on Earth would you have those blocked Busy? Surely you're not that daft?!

Busy's picture

He has: 6,151 posts

Joined: May 2001

None of those keywords are blocked.

Here is my list (a smaller one that still blocks the validator as well):
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} almaden [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Anarchie [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ASPSeek [NC,OR]
RewriteCond %{HTTP_USER_AGENT} attach [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BackWeb [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Bandit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BatchFTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Bot\ mailto:[email protected] [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BlackWidow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Buddy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CherryPicker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ChinaClaw [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Collector [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Copier [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CICC [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Crescent [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Custo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} DA [NC,OR]
RewriteCond %{HTTP_USER_AGENT} DISCo\ Pump [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Download\ Demon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Download\ Wonder [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Downloader [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Drip [NC,OR]
RewriteCond %{HTTP_USER_AGENT} DSurf15a [NC,OR]
RewriteCond %{HTTP_USER_AGENT} dts\ agent$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} EasyDL [NC,OR]
RewriteCond %{HTTP_USER_AGENT} eCatch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} EirGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} EmailSiphon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} EmailWolf [NC,OR]
RewriteCond %{HTTP_USER_AGENT} EyeNetIE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Express\ WebPictures [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ExtractorPro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} FileHound [NC,OR]
RewriteCond %{HTTP_USER_AGENT} FlashGet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} frontpage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GetSmart [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GetWeb! [NC,OR]
RewriteCond %{HTTP_USER_AGENT} gigabaz [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Go!Zilla [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Go-Ahead-Got-It [NC,OR]
RewriteCond %{HTTP_USER_AGENT} gotit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Grabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GrabNet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Grafula [NC,OR]
RewriteCond %{HTTP_USER_AGENT} grub-client [NC,OR]
RewriteCond %{HTTP_USER_AGENT} HMView [NC,OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} httpdown [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ia_archiver [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Image\ Stripper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Image\ Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} InterGET [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Internet\ Ninja [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Iria [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Irvine [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Java [NC,OR]
RewriteCond %{HTTP_USER_AGENT} JetCar [NC,OR]
RewriteCond %{HTTP_USER_AGENT} JOC [NC,OR]
RewriteCond %{HTTP_USER_AGENT} JustView [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} LeechFTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} LexiBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} lftp [NC,OR]
RewriteCond %{HTTP_USER_AGENT} linkwalker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} likse [NC,OR]
RewriteCond %{HTTP_USER_AGENT} marcopolo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Magnet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Mag-Net [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Mass\ Downloader [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Memo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MIDown\ tool [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Mirror [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Mister\ PiX [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MJ12bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Moozilla [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MS\ FrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MSIECrawler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MSProxy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} NaverRobot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NaverBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Navroad [NC,OR]
RewriteCond %{HTTP_USER_AGENT} NearSite [NC,OR]
RewriteCond %{HTTP_USER_AGENT} NetAnts [NC,OR]
RewriteCond %{HTTP_USER_AGENT} NetSpider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Net\ Vampire [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Netwu [NC,OR]
RewriteCond %{HTTP_USER_AGENT} NetZip [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Ninja [NC,OR]
RewriteCond %{HTTP_USER_AGENT} NICErsPRO [NC,OR]
RewriteCond %{HTTP_USER_AGENT} NPBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} obot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Octopus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Offline\ Explorer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Offline\ Navigator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PageGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Papa\ Foto [NC,OR]
RewriteCond %{HTTP_USER_AGENT} pavuk [NC,OR]
RewriteCond %{HTTP_USER_AGENT} pcBrowser [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Pockey [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Program\ Shareware [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Pump [NC,OR]
RewriteCond %{HTTP_USER_AGENT} QuepasaCreep [NC,OR]
RewriteCond %{HTTP_USER_AGENT} RealDownload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Reaper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Recorder [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ReGet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Siphon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SiteSnagger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SlySearch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SmartDownload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Snake [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SpaceBison [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Steeler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Stripper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SuperBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SuperHTTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Surfbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} tAkeOut [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Teleport\ Pro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Tutorial\ Crawler [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Vacuum [NC,OR]
RewriteCond %{HTTP_USER_AGENT} VoidEYE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebCapture [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebCopier [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Webster [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Web(\ Image|\ Sucker|Auto|bandit|Fetch|site|ZIP|.*er) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Wget [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Whacker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Widow [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WISEnutbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Xaldon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Zao [NC,OR]
RewriteCond %{HTTP_USER_AGENT} zyborg
RewriteRule /*$ http://www.no-one-home.com [L,R]

I'm sure they're not in that list. The only other sites I block are by IP, and none were even close to your original link, Jeeves.

JeevesBond's picture

He has: 3,956 posts

Joined: Jun 2002

OK, stupid question coming up: if you take these rules out of Apache, does validation work?

If so, what happens if you remove only the IPs (as we know the user agents are OK)?


Busy's picture

He has: 6,151 posts

Joined: May 2001

To be honest, I haven't tried removing them, as I was hoping it would be an easy fix instead of going through each one. Even doing that, the page may be served from cache (even with F5), so I could miss a line.
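One way to sidestep the browser-cache worry is to request the page from a script while sending the validator's User-Agent string, and look at the raw status code instead of the rendered page. The following is only a sketch using Python's standard library; the URL in the usage comment is a placeholder, and `build_request`, `probe` and `NoRedirect` are names made up for this example.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx response stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the redirect.
        return None


def build_request(url, user_agent):
    """Build a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def probe(url, user_agent):
    """Return (status_code, redirect_target_or_None) for one request."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(build_request(url, user_agent), timeout=10) as resp:
            return resp.status, None
    except urllib.error.HTTPError as err:
        # Redirects and error statuses both end up here.
        return err.code, err.headers.get("Location")


# Usage (placeholder URL, replace with the real site):
#   probe("http://www.example.com/", "W3C_Validator/1.432.2.5")
# A 3xx status with a Location header pointing at the "bad bot" target would
# suggest the rewrite rules, rather than the deny list, are catching the request.
```

Running this once before and once after commenting out a block of rules should show whether that block is responsible, without the browser cache getting in the way.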

The IPs are:
deny from 24.6.247.154
deny from 24.98.11.213
deny from 38.118.42.38
deny from 61.18.186.2
deny from 62.31.52.41
deny from 63.101.52.139
deny from 63.148.99.224
deny from 63.164.41.43
deny from 64.37.103.34
deny from 64.86.121.83
deny from 64.94.199.9
deny from 65.29.185.5
deny from 65.94.166.55
deny from 65.94.248.170
deny from 65.94.226.152
deny from 65.102.12.225
deny from 65.102.23.153
deny from 65.102.23.161
deny from 65.102.23.169
deny from 66.114.67.120
deny from 66.119.33.170
deny from 66.147.154.3
deny from 66.196.65.49
deny from 66.196.65.50
deny from 66.196.65.51
deny from 66.196.65.52
deny from 66.196.65.53
deny from 66.196.65.54
deny from 66.196.65.55
deny from 66.196.65.56
deny from 66.196.65.57
deny from 66.196.65.58
deny from 66.196.65.59
deny from 66.207.120.227
deny from 66.232.40.147
deny from 67.68.197.100
deny from 67.68.197.102
deny from 67.68.198.71
deny from 67.121.213.66
deny from 69.41.14.3
deny from 80.180.146.244
deny from 80.218.79.29
deny from 81.199.108.5
deny from 82.34.77.6
deny from 82.34.77.78
deny from 131.107.163.49
deny from 142.31.155.216
deny from 168.75.177.2
deny from 172.180.225.231
deny from 195.154.174.164
deny from 195.166.67.98
deny from 203.168.223.161
deny from 207.44.200.37
deny from 208.8.57.2
deny from 208.254.8.86
deny from 210.86.52.184
deny from 211.28.59.154
deny from 211.99.203.199
deny from 211.99.213.14
deny from 211.99.213.20
deny from 211.152.14.91
deny from 211.152.14.92
deny from 211.152.14.93
deny from 211.152.14.94
deny from 211.152.14.95
deny from 211.152.14.96
deny from 211.154.167.242
deny from 211.154.171.120
deny from 211.157.8.41
deny from 211.157.8.42
deny from 211.157.8.43
deny from 211.157.8.44
deny from 211.157.8.45
deny from 211.157.8.46
deny from 212.219.92.233
deny from 212.234.120.220
deny from 213.8.63.6
deny from 213.138.110.24
deny from 213.250.116.96
deny from 216.88.158.142
deny from 216.237.226.228
deny from 217.24.128.204
deny from 217.73.164.106
deny from 218.252.233.116

Busy's picture

He has: 6,151 posts

Joined: May 2001

Stupid me, the IPs aren't at fault, as they would just be denied access. It has to be the user agent, as it's being kicked to no-one-home.com (or wherever I point it to).
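Rather than removing the rules one by one, the patterns can be checked offline against the user-agent strings Renegade found in his logs. The sketch below assumes Python's `re` module behaves like Apache's regex engine for simple patterns such as these, and it only includes a short excerpt of the list above; `parse_conditions` and `matching_patterns` are hypothetical helper names. Because the patterns are unanchored and use `[NC]`, a short entry like `DA` matches anywhere in the string, including the "da" in "Validator".

```python
import re

# Excerpt of the rule list posted above (not the full list).
SAMPLE_RULES = r"""
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} attach [NC,OR]
RewriteCond %{HTTP_USER_AGENT} DA [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Download\ Demon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Java [NC,OR]
RewriteCond %{HTTP_USER_AGENT} obot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Wget [NC,OR]
RewriteCond %{HTTP_USER_AGENT} zyborg
"""


def parse_conditions(htaccess_text):
    """Return (pattern, case_insensitive) for each User-Agent RewriteCond."""
    conditions = []
    for line in htaccess_text.splitlines():
        # Split on whitespace that is not backslash-escaped, so patterns
        # such as "Download\ Demon" stay in one piece.
        parts = re.split(r"(?<!\\)\s+", line.strip())
        if len(parts) < 3 or parts[0] != "RewriteCond":
            continue
        if parts[1] != "%{HTTP_USER_AGENT}":
            continue
        flags = parts[3].strip("[]").split(",") if len(parts) > 3 else []
        conditions.append((parts[2], "NC" in flags))
    return conditions


def matching_patterns(user_agent, conditions):
    """List every pattern that matches the given user-agent string."""
    hits = []
    for pattern, nocase in conditions:
        if re.search(pattern, user_agent, re.IGNORECASE if nocase else 0):
            hits.append(pattern)
    return hits


conditions = parse_conditions(SAMPLE_RULES)
for agent in ("W3C_Validator/1.432.2.5",
              "Jigsaw/2.2.5 W3C_CSS_Validator_JFouffa/2.0"):
    print(agent, "->", matching_patterns(agent, conditions))
```

With this excerpt, both validator strings are caught by the `DA` entry alone. Pasting the full .htaccess contents into `SAMPLE_RULES` would show whether any other entry also matches. If `DA` turns out to be the culprit, anchoring it or making it more specific would be one option, but the right pattern depends on what that bot's real user-agent string looks like.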
