Thread: How to use robots.txt?

  1. #1
    Join Date
    Jan 2012
    Posts
    26

    How do I use robots.txt?

  2. #2
    Join Date
    Dec 2011
    Posts
    98
    The robots.txt file is meant for search engine robots.
    Using a robots.txt file is easy, but it does require access to your server's root location. For instance, if your site is located at:
    http://adomain.com/mysite/index.html
    you will need to be able to create a file located here:
    http://adomain.com/robots.txt
    If you cannot access your server's root location you will not be able to use a robots.txt file to exclude pages from your index.

    The robots.txt is a plain TEXT file (not HTML!) with a section for each robot to be controlled. Each section starts with a User-agent line naming the robot, followed by a list of Disallow and Allow lines. Each Disallow blocks any address whose path starts with the disallowed string. Similarly, each Allow permits any address whose path starts with the allowed string. How a conflict between an Allow and a Disallow is resolved depends on the crawler: some scan the lines in order and use the first or the last match, while the current standard (RFC 9309) and Google use the most specific (longest) match, so check the documentation for the robot you care about. If there are no matches at all, the address is allowed.
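
    To see the prefix matching in action, here is a minimal sketch using Python's standard urllib.robotparser module. The rules, the /private/ paths, and the "ExampleBot" name are made up for illustration. Note that this particular parser applies the first matching line in file order, which may not be how the robot you are targeting resolves conflicts.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, invented for this example.
RULES = """\
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
"""


def is_allowed(url: str, agent: str = "ExampleBot") -> bool:
    """Check a URL against RULES for the given (hypothetical) robot name."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())  # parse from text, no network access
    return parser.can_fetch(agent, url)


# Path starts with the disallowed string "/private/".
print(is_allowed("http://adomain.com/private/secret.html"))       # False
# The Allow line matches first, so this page stays reachable.
print(is_allowed("http://adomain.com/private/public-page.html"))  # True
# No line matches, so the address is allowed.
print(is_allowed("http://adomain.com/mysite/index.html"))         # True
```

    The Allow line is placed before the broader Disallow so that first-match and longest-match robots both reach the same answer for this file.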

  3. #3
    Join Date
    Oct 2011
    Posts
    53
    Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.

    It works like this: a robot wants to visit a Web site URL, say http://www.example.com/welcome.html. Before it does so, it first checks for http://www.example.com/robots.txt, and finds:

    User-agent: *
    Disallow: /

    The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site.
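
    The check a well-behaved robot does before visiting a page can be sketched roughly as follows, again with Python's standard library. The function names robots_url and may_visit and the "ExampleBot" name are my own, and the file body is passed in as a string so nothing is fetched over the network.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url: str) -> str:
    """Return the robots.txt address at the root of the page's site."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def may_visit(page_url: str, robots_body: str, agent: str) -> bool:
    """Decide whether `agent` may visit `page_url` given the file's text."""
    parser = RobotFileParser(robots_url(page_url))
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(agent, page_url)


BODY = "User-agent: *\nDisallow: /\n"  # the two lines from the example above

print(robots_url("http://www.example.com/welcome.html"))
# http://www.example.com/robots.txt
print(may_visit("http://www.example.com/welcome.html", BODY, "ExampleBot"))
# False
```

    A real crawler would download the file first (RobotFileParser has set_url() and read() for that), but the decision step is the same.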

    There are two important considerations when using /robots.txt:

    Robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities and email address harvesters used by spammers will pay no attention to it.
    The /robots.txt file is publicly available. Anyone can see which sections of your server you don't want robots to visit.

  4. #4
    Join Date
    May 2012
    Location
    Kuala Lumpur, Malaysia
    Posts
    37
    If you cannot access your server's root location you will not be able to use a robots.txt file to exclude pages from your index.
    Those who are unable to use robots.txt can put the instructions in a robots meta tag in each page's <head> instead, e.g.:
    <meta name="robots" content="noindex, nofollow">
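
    If you want to confirm that a page really carries the tag, here is a small sketch of how the directives could be read out of the HTML with Python's standard html.parser. The class and function names are hypothetical, and real crawlers may interpret the tag in more detail than this.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated; normalise case and whitespace.
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(sorted(robots_directives(page)))  # ['nofollow', 'noindex']
```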

    A full explanation of robots.txt and nofollow usage:
    http://www.seobook.com/robots-txt-vs...obots-nofollow

  5. #5
    Join Date
    May 2012
    Location
    Columbus, OH
    Posts
    13
    Here is the definitive site: http://www.robotstxt.org/robotstxt.html

    Here is a link to a generator for robots.txt: http://www.basisoft.com/

    Make sure you don't exclude search engines from the public areas of your site. You want to keep those available for the public to find.
