Robots exclusion question.

They have: 16 posts

Joined: Nov 2006

Hi,

I have many hundreds of pages created by a database in the following format:

domain.com/availability.asp?id=26&theyear=2007&themonth=9
domain.com/availability.asp?id=26&theyear=2007&themonth=10
domain.com/availability.asp?id=26&theyear=2007&themonth=11

and so on.

I don't want these indexed as they are duplicates of each other.

What is the best way?

Should I add the robots noindex/nofollow meta to each page? Or, if I exclude /availability.asp in the robots.txt, will that also disallow all of the files with the ?id= part too?
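(For reference, the meta tag option I have in mind would be something along these lines in the head of each generated page:

<meta name="robots" content="noindex, nofollow">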

They have: 27 posts

Joined: Mar 2007

Hi Hampstead,

You can actually single out certain parameters for the Googlebot. Adding these lines to your robots.txt file tells the Googlebot not to index any URLs with the "theyear" and "themonth" parameters:

User-agent: Googlebot
Disallow: /*theyear=
Disallow: /*themonth=

If you were to exclude /availability.asp in your robots.txt file, it would also exclude the URLs with the "id" parameter.
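If you decide to go that route, the rule would look something like this (blocking availability.asp and everything after it, query strings included):

User-agent: *
Disallow: /availability.asp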

More info on the Googlebot wildcard here: Google Robots.txt Wildcard

:)

They have: 16 posts

Joined: Nov 2006

Thanks for this. I do not want any URLs with the id parameter indexed anyway, so perhaps excluding /availability.asp would be the way forward?

They have: 27 posts

Joined: Mar 2007

You're welcome.

And yes, if you don't want the "id" pages indexed either, blocking the whole page is the way to go.

:)
