SEOHigh.Com SEO/SEM Services

Toronto Search Engine Optimization (SEO) Search Engine Marketing (SEM)

 

 

How to Control Search Engine Spiders

Robots.txt Implementation


One of the most fundamental steps in optimizing a website is writing a robots.txt file. It tells spiders what is useful and public for sharing in the search engine indexes and what is not. Note, however, that not all search spiders will follow the instructions you leave in the robots.txt file. In addition, a poorly written robots.txt file can stop search spiders from crawling and indexing your website properly. In this article I will show you how to be sure everything works correctly.


While many other SEOs will tell you that a robots.txt file will not improve your rankings, I disagree. For the robots to index your site properly, they need instructions on which folders or files not to crawl or index, as well as which ones you want indexed.


Another good reason to use a robots.txt file is that many of the search engines tell the public to use one on their websites. Below is a quote taken from Google:


Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it's current for your site so that you don't accidentally block the Googlebot crawler.


Even though others feel this is of no use unless you are blocking content, keep this in mind: when a search engine (and this is the tightest-lipped search engine ever) goes out of its way to tell us to use something, it is usually to one's advantage to follow the little clues we are offered.
Also, if you read the stats file on your web hosting server, you will usually find the URL of your robots.txt being requested. If a search bot asks for the robots.txt and does not find it on your server, the spider often just leaves.

How do you build a robots.txt file for your website? I am glad you asked. One thing you do not want to do is use an HTML editor to build this file. The easiest way to create the file is with a plain text editor like Notepad. After opening Notepad (or another text editor), save the blank file as robots.txt. Once it is complete, this file will be placed at the root level of your web server, in other words the same folder as your index page.
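
For readers who prefer a script to Notepad, the same step can be sketched in Python. This is only an illustration: the helper names, the temporary folder standing in for your web root, and the www.example.com address are all assumptions, not part of any real site.

```python
from pathlib import Path
from urllib.parse import urljoin
import tempfile


def create_blank_robots_txt(web_root: Path) -> Path:
    """Create an empty plain-text robots.txt inside web_root (hypothetical helper)."""
    robots_path = web_root / "robots.txt"
    robots_path.write_text("", encoding="ascii")  # plain text, no HTML markup
    return robots_path


def robots_txt_url(any_page_url: str) -> str:
    """Return the root-level address where spiders look for robots.txt."""
    return urljoin(any_page_url, "/robots.txt")


# A temporary folder stands in for the real web root (the folder holding your index page).
demo_web_root = Path(tempfile.mkdtemp())
created_file = create_blank_robots_txt(demo_web_root)

# example.com is a placeholder domain used only for illustration.
expected_url = robots_txt_url("http://www.example.com/articles/page1.html")
```

Whatever page of the site you start from, the resulting address points at the root of the server, which is the only place spiders look for the file.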

Now I will cover several different methods of efficiently using a robots.txt file to direct the robot to crawl the correct directories and avoid others.
First we will discuss how to format the information. The text file is actually a list. Each of its directions consists of two fields, or lines of instruction.
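
As a rough sketch of what such a two-line direction can look like, the Python snippet below assumes the two fields are the standard User-agent and Disallow lines, and checks the result with the robots.txt parser from Python's standard library. The /private/ folder, the ExampleBot name, and the example.com addresses are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed example record: the first field names the spider ("*" means all spiders),
# the second field names what that spider should not crawl.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# ExampleBot is a hypothetical spider name; the URLs are placeholders.
public_allowed = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
private_allowed = parser.can_fetch("ExampleBot", "http://www.example.com/private/page.html")
```

Running a draft file through a parser like this before uploading it is one way to catch a rule that blocks more of the site than you intended.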
