robots file
If I ban all search engines with a robots.txt, the rogue ones will obviously ignore it. I presume the good ones (Google, Yahoo, etc.) will still honour the robots.txt they find on my site and ignore anything of mine they find elsewhere, such as a link on a third-party website or on another search engine masquerading as a content site?
So the ones that obey robots.txt will never index anything from my site, regardless of what they find and where they find it?
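For reference, here is a minimal sketch of the "ban everyone" file the question describes, checked with Python's standard-library urllib.robotparser. The example.com URL and the is_allowed helper are illustrative assumptions, not anything from the poster's site. The check only shows how a compliant robot would read the rules; it says nothing about what a particular search engine does with URLs it discovers through links elsewhere.

```python
from urllib.robotparser import RobotFileParser

# robots.txt content that asks every robot to stay out of the whole site.
BAN_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(rules: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain (assumption for illustration).
    print(is_allowed(BAN_ALL, "Googlebot", "https://example.com/page.html"))
```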
mrgilb posted this at 15:15 — 30th September 2009.
They have: 25 posts
Joined: Sep 2009
Thanks for the information. Keep it up with robots.txt.
Michael James Swan posted this at 12:02 — 1st January 2010.
He has: 400 posts
Joined: May 2008
I remember a case a little while ago in which someone did this, but their pages still ended up in Google's index.
This happened because someone else had linked to a page that robots.txt did not allow to be indexed. The page itself was not indexed; it was merely noticed and listed by the search engine.
sequencehosting posted this at 22:48 — 25th February 2010.
They have: 24 posts
Joined: Feb 2010
"This happened because someone else had linked to a page that robots.txt did not allow to be indexed. The page itself was not indexed; it was merely noticed and listed by the search engine."
This is true and still happens today. I believe the best way to remove a site is through webmaster tools.
joshbectt posted this at 01:42 — 10th January 2010.
They have: 5 posts
Joined: Jan 2010
I have had a problem where robots still came to my site. But then again, it never really hurt me at all.
jakebrown posted this at 13:05 — 26th February 2010.
They have: 12 posts
Joined: Feb 2010
Usually, if your site content is crawled by scrapers and reposted, the copies will show up on Google, including any links pointing to your site. But Google will usually obey robots.txt and not show your website directly in results.
Robots which do not obey the protocol can still crawl your website. Content scrapers can still work.
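To illustrate why this is so: the robots.txt check is something the robot performs on itself, so nothing on the server enforces it. The sketch below contrasts the two behaviours using Python's urllib.robotparser. The rules, the URLs, the "PoliteBot" name and both helper functions are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules and URLs, used only for this illustration.
RULES = """\
User-agent: *
Disallow: /private/
"""

URLS = [
    "https://example.com/index.html",
    "https://example.com/private/report.html",
]


def polite_crawl(urls, rules, agent):
    """URLs a compliant robot would fetch: it checks each one against the rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [u for u in urls if parser.can_fetch(agent, u)]


def rogue_crawl(urls):
    """URLs a non-compliant robot would fetch: it never consults the rules."""
    return list(urls)
```

A scraper behaves like rogue_crawl here, which is why blocking it needs something on the server side rather than robots.txt.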
johnson.kelly12 posted this at 07:00 — 28th April 2010.
They have: 36 posts
Joined: Apr 2010
Will this robots.txt file work with all search engines?
freddavis posted this at 18:20 — 26th October 2010.
They have: 22 posts
Joined: Oct 2010
A robots.txt file will help you hide content that you don't want to show to search engines. It works with all search engines that honour it.
faca5 posted this at 14:44 — 7th December 2010.
They have: 20 posts
Joined: Nov 2010
You can ban all search engines using robots.txt, but some search engines may not check your file and may index your pages anyway.
joomlads07 posted this at 07:08 — 4th December 2010.
They have: 5 posts
Joined: Feb 2010
The robots.txt file is a set of instructions for visiting robots (spiders) that index the content of your web site pages. For those spiders that obey the file, it provides a map of what they can and cannot index. The file must reside in the root directory of your web site.
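Because the file has to sit at the root, a robot can work out where to look from any page address on the site. A small sketch of that step, using Python's urllib.parse (the robots_url helper and the example.com address are assumptions for illustration):

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Build the robots.txt address for the site that hosts `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # example.com is a placeholder domain.
    print(robots_url("https://www.example.com/blog/post.html?page=2"))
```

A robots.txt placed in a subdirectory, such as /blog/robots.txt, would not be found by this lookup.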
faca5 posted this at 13:24 — 21st December 2010.
They have: 20 posts
Joined: Nov 2010
Maybe you could try banning by user-agent?
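A rough sketch of what a per-user-agent ban could look like, checked with Python's urllib.robotparser. "BadBot" is a made-up robot name and may_fetch is a hypothetical helper. As with any robots.txt rule, this only affects robots that choose to read and obey the file.

```python
from urllib.robotparser import RobotFileParser

# "BadBot" is a hypothetical robot name. The empty Disallow line in the
# second group means "nothing is disallowed" for every other robot.
PER_AGENT_RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""


def may_fetch(agent: str, url: str) -> bool:
    """Check `url` against PER_AGENT_RULES for the given user-agent string."""
    parser = RobotFileParser()
    parser.parse(PER_AGENT_RULES.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    page = "https://example.com/page.html"  # placeholder URL
    print(may_fetch("BadBot/1.0", page))
    print(may_fetch("Googlebot", page))
```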
arthur posted this at 06:40 — 22nd December 2010.
They have: 111 posts
Joined: Aug 2010
The purpose of a robots.txt file is to keep search engines away from some of your website's pages. This is useful when some of your pages are still under construction and you don't want search engines to index them; you can list them there.
The robots.txt file goes in the main directory, in the same place as your index.html page.
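As a sketch of the under-construction case: the rules below keep compliant robots out of one directory and leave the rest of the site open. The /under-construction/ path, the example.com URLs and the blocked_pages helper are assumptions for illustration, checked here with Python's urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: unfinished pages live under /under-construction/.
PARTIAL_RULES = """\
User-agent: *
Disallow: /under-construction/
"""


def blocked_pages(urls, rules=PARTIAL_RULES, agent="*"):
    """Return the URLs that a compliant robot would skip under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]


if __name__ == "__main__":
    pages = [
        "https://example.com/index.html",
        "https://example.com/under-construction/new-shop.html",
    ]
    print(blocked_pages(pages))
```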