Strange page view errors ...

They have: 54 posts

Joined: Oct 2001

I'm not sure this is the right spot to post this, but wasn't sure where else to put it!

I've noticed an odd page view error showing up in my site statistics very regularly of late and I'm just not sure what might be causing it. The page is listed simply as: / ... Our domain is ethics.org.au and all page errors are listed as pages from the root directory - so, for instance, it might list /blah.htm as a page which generated an error.

So, logic tells me that the error is being generated by people going to ethics.org.au/ but when I enter this myself it goes to the home page with no problem. We're getting something like 4000 of these page errors a week, which is why it's bothering me. Does anybody have any clues as to what might be triggering it and if I should actually be worried about it?

Thanks!

Busy's picture

He has: 6,151 posts

Joined: May 2001

Are you able to look at your raw log files? If so, can you copy and paste an entry line (just one line) so we can see the IP and UA?

My guess is it's a bot or script. If so, you can block the IP (if it's just one) or the UA of the bot/script.

They have: 54 posts

Joined: Oct 2001

Hi Busy

Unfortunately I'm not looking at raw logs but at an online stats service provided by our ISP - so it processes the logs for us. We can organise raw logs (for an additional fee :/) ... perhaps if we're getting odd results this is worth doing?

Renegade's picture

He has: 3,022 posts

Joined: Oct 2002

Well, a "/" shows up in my site stats pages too. I think it's just where people type in "www.yourdomain.com" or "www.yourdomain.com/". It's nothing to worry about; it's just your index.php or index.html page. :)

Busy's picture

He has: 6,151 posts

Joined: May 2001

They charge you an additional fee to view your log files? Is it a free host?

Ok maybe we can find a free way around this
*thinking cap on - cough cough - smoke clears - 2 watt lightbulb goes on*

Do your online stats show the ISP of the person, or a UA, or anything?
Are you able to use any server-side scripts (PHP, ASP, Perl ...)?
If you can, we can make you a basic log file to see who's coming in and doing this.
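A "basic log file" of the kind suggested here only needs a timestamp, the visitor's IP, the requested path and the UA. Below is a minimal sketch in Python, purely for illustration (on this host it would more likely be written in ASP or PHP). The function names `format_hit` and `log_hit` and the tab-separated layout are assumptions, not anything from the host or the stats service.

```python
from datetime import datetime, timezone


def format_hit(ip, user_agent, path, when=None):
    """Build one tab-separated log line; a missing UA is written as '-'."""
    when = when or datetime.now(timezone.utc)
    ua = user_agent if user_agent else "-"
    return f"{when:%Y-%m-%d %H:%M:%S}\t{ip}\t{path}\t{ua}"


def log_hit(log_path, ip, user_agent, path, when=None):
    """Append one hit to the log file at log_path (hypothetical location)."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(format_hit(ip, user_agent, path, when) + "\n")
```

Calling `log_hit` once per request would build up a file where hits with no UA stand out as lines ending in `-`.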

I suppose you should check with your host whether you can use a .htaccess file before we go on, otherwise this could all be in vain (we could use JavaScript to redirect them, but that can be turned off).

They have: 54 posts

Joined: Oct 2001

It's a paid host - the logs issue strikes me as a bit odd too! They don't charge much for getting them, but I've just never bothered to organise it as the online stats were okay until I started seeing odd things.

The host runs a Microsoft environment, so I'm not sure what will work best. We run ASP on the site. I think from memory we can run PHP but not Perl.

Maybe it's a good idea to just pay them the AU$16 a month and see the raw logs. Hopefully that will give a much better clue as to what's going on!

Busy's picture

He has: 6,151 posts

Joined: May 2001

Some (stupid) bots cause simple errors in logs when trying to download/spider your site. I've banned about 40 so far for trying to get things like: myurl/" or myurl// or even a doubled-up myurl/myurl

*myurl is my actual url, didn't want to spam my sites ;)*

Busy's picture

He has: 6,151 posts

Joined: May 2001

AU$16 a month to view your logs? Crikey, that's nearly as much as I pay a month for 800 MB of webspace.
It could be nothing, as Renegade said, so it's up to you, but we could make up a script (PHP or something) to track 'em down.

They have: 54 posts

Joined: Oct 2001

Okay, I have organised the raw log files as I'm keen to get to the bottom of this! Using some basic software to analyse the logs I've realised that the actual error being triggered is a 411 'length required' error.

Looking at the raw logs it seems that these sorts of entries are responsible:

Mozilla/4.0+(compatible;+MSIE+5.5;+Windows+98) - -
2003-10-19 00:00:19 65.37.48.158 - W3SVC60 BNE117V 203.147.160.60 80 SEARCH / - 411 0 245 43 0 HTTP/1.1 - - -
2003-10-19 00:01:14 218.86.173.53 - W3SVC60 BNE117V 203.147.160.60 80 GET /index.htm - 200 64 0 190 651 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.5;+Windows+98) - -
2003-10-19 00:01:14 218.86.173.53 - W3SVC60 BNE117V 203.147.160.60 80 SEARCH / - 411 0 245 43 0 HTTP/1.1 - - -

Hmmm ... anyone have any clues? From what I've read about 411 errors so far it looks like my host needs to sort this one out.

Bugger ... and my stats were looking really quite good for a bit and now I think lots of the extra traffic is probably due to whatever is causing this error!

Busy's picture

He has: 6,151 posts

Joined: May 2001

Do you have a search form on your site? If so, is it your own, your host's, a third party's or something an alien planted? ;)
Out of those 3 entries listed, only 2 are errors; they did get the index.htm page.

I did a search on the web and it's not an uncommon problem. I can't get a straight answer, but it seems someone could be using a home-grown browser, or be up to no good. Do you have any files/folders that are off limits? Have these been viewed?

It looks like a bot (no UA on the SEARCH ones) but I can't be 100% sure.
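The pattern in the pasted entries (SEARCH requests, a 411 status, no UA) can be tallied across a whole log with a short script. This is only a sketch: the field positions are an assumption inferred from the three complete sample lines in this thread, and a real log should be checked against its own `#Fields:` header before trusting the indexes.

```python
from collections import Counter

# The three complete entries pasted earlier in the thread.
SAMPLE = """\
2003-10-19 00:00:19 65.37.48.158 - W3SVC60 BNE117V 203.147.160.60 80 SEARCH / - 411 0 245 43 0 HTTP/1.1 - - -
2003-10-19 00:01:14 218.86.173.53 - W3SVC60 BNE117V 203.147.160.60 80 GET /index.htm - 200 64 0 190 651 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.5;+Windows+98) - -
2003-10-19 00:01:14 218.86.173.53 - W3SVC60 BNE117V 203.147.160.60 80 SEARCH / - 411 0 245 43 0 HTTP/1.1 - - -
"""

# Assumed 0-based field positions, inferred from the sample lines only.
IP, METHOD, URI, STATUS, USER_AGENT = 2, 8, 9, 11, 17


def parse_line(line):
    """Return a dict for one log entry, or None for headers and short lines."""
    if line.startswith("#"):
        return None
    fields = line.split()
    if len(fields) <= USER_AGENT:
        return None
    return {
        "ip": fields[IP],
        "method": fields[METHOD],
        "uri": fields[URI],
        "status": fields[STATUS],
        "has_ua": fields[USER_AGENT] != "-",
    }


def summarise(text):
    """Count entries by (method, status, has_ua)."""
    tally = Counter()
    for line in text.splitlines():
        entry = parse_line(line)
        if entry is not None:
            tally[(entry["method"], entry["status"], entry["has_ua"])] += 1
    return tally


if __name__ == "__main__":
    for key, count in summarise(SAMPLE).most_common():
        print(key, count)
```

On the sample above, the two SEARCH entries fall into one bucket (411, no UA) and the GET for /index.htm into another (200, with a UA). Filtering out the first bucket would also give a rough idea of how much the errors inflate the visitor figures.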

They have: 54 posts

Joined: Oct 2001

Hi Busy! We do have a search form, yes. It's the free version offered by atomz.com. There's a basic form on all pages of the site - the results pages appear on an atomz server rather than our own and atomz indexes the site automatically once a week. But perhaps I've taken the wrong lines out of our logs and those aren't the relevant ones at all. I'll have another look.

Hmmm ... seems like a bit of mystery. I've asked our hosts about it - no response yet.

Busy's picture

He has: 6,151 posts

Joined: May 2001

I edited my post (found out more). Have a look for hidden files/folders that may have been viewed.

I use Atomz as well, good search tool, but it's not them doing it; they have a UA (their name shows up as the referrer).

They have: 54 posts

Joined: Oct 2001

Hi again Busy! We do have a database directory which is secure. We also have an SSL folder, which may be secure - I didn't set it up. Also, I've had a response from our hosts as follows:

****
This is an odd one! It looks as if various connections are being made to your site under the assumption it is actually an IP printer. Although it could well be an attempted exploit. I do not believe you need to be concerned as these hits are not potential visitors that you are losing and the fact that they are page errors means they have not been able to access your server. This type of traffic is extremely common on a high proportion of websites and not something which you need to lose any sleep over.
****

Hmmm ... I am a bit concerned because it's probably quite badly skewing my visitor statistics, although in a way which makes the site look very popular!

Renegade's picture

He has: 3,022 posts

Joined: Oct 2002

eh, don't worry about it.

Busy's picture

He has: 6,151 posts

Joined: May 2001

Yeah, do a search on Google for 411 errors and see what you think. I'd be inclined to set up a trap for it, to see if it is a bot or not. If it's all from the same IP range you can block the range with .htaccess.

If it's not doing any damage and it's increasing your stats, it can't be a bad thing, can it? ;)
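Whether the hits really come from one IP range can be checked before blocking anything. Here is a rough sketch using Python's standard `ipaddress` module; the function name `count_by_network` and the /24 grouping are assumptions chosen for illustration, and the addresses in the usage note are the two client IPs from the log excerpt in this thread.

```python
import ipaddress
from collections import Counter


def count_by_network(ips, prefix=24):
    """Group IPv4 addresses by their enclosing network of the given prefix length."""
    tally = Counter()
    for ip in ips:
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        tally[str(network)] += 1
    return tally
```

For example, `count_by_network(["65.37.48.158", "218.86.173.53", "218.86.173.53"])` puts the two 218.86.173.53 hits together under 218.86.173.0/24 and the other address under 65.37.48.0/24. If the offenders are scattered over many unrelated networks, a single range block will not catch them.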
