<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Free Web Resources - Web Resources Depot &#187; Robots.txt</title>
	<atom:link href="http://www.webresourcesdepot.com/tag/robotstxt/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.webresourcesdepot.com</link>
	<description>Free Web Resources</description>
	<lastBuildDate>Fri, 19 Mar 2010 06:22:40 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>How To Use Robots.Txt File?</title>
		<link>http://www.webresourcesdepot.com/how-to-use-robotstxt-file/</link>
		<comments>http://www.webresourcesdepot.com/how-to-use-robotstxt-file/#comments</comments>
		<pubDate>Thu, 24 Jan 2008 17:06:42 +0000</pubDate>
		<dc:creator>Umut M.</dc:creator>
				<category><![CDATA[Extras]]></category>
		<category><![CDATA[No License]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Robots.txt]]></category>
		<category><![CDATA[Search Engine]]></category>
		<category><![CDATA[Seo]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://www.webresourcesdepot.com/how-to-use-robotstxt-file/</guid>
		<description><![CDATA[The robots.txt file is often overlooked. Yet it is an important factor in getting webpages indexed properly, and it is very easy to set up.
I know that robots.txt is not something new. But I&#8217;ve been preparing an SEO sheet for a while and wanted to share this small &#38; useful portion with you.
What is [...]]]></description>
			<content:encoded><![CDATA[<p>The <strong>robots.txt file</strong> is often overlooked. Yet it is an important factor in getting webpages indexed properly, and it is very easy to set up.</p>
<p>I know that <strong>robots.txt</strong> is not something new. But I&#8217;ve been preparing an SEO sheet for a while and wanted to share this small &amp; useful portion with you.</p>
<h3>What is robots.txt?</h3>
<p><strong>Robots.txt</strong> is a file, placed in the root folder of a website, that is used to exclude content from the crawling process of search engine spiders / bots. It is the file behind what is called the Robots Exclusion Protocol.</p>
<h3>Why use robots.txt?</h3>
<p>In general, we want our webpages to be indexed by the search engines. But there may be some content that we don&#8217;t want to be crawled &amp; indexed: a personal images folder, a website administration folder, a web developer&#8217;s test folder for a customer, folders with no search value like cgi-bin, and many more.</p>
<h3>Is a robots.txt file a guaranteed solution?</h3>
<p>No. Standards-based bots, like those of Google, Yahoo and the other big search engines, obey your <strong>robots.txt</strong> file, but only because they are programmed to. The file is a request, not an access control: any bot can be configured to ignore the <strong>robots.txt</strong> file. Result: there is no guarantee.</p>
<h3>How to use a robots.txt file?</h3>
<p>The <strong>robots.txt</strong> file has some simple directives which manage the bots. These are:</p>
<ul>
<li><strong>User-agent:</strong> this parameter defines which bots the following parameters apply to. <strong>*</strong> is a wildcard which means all bots, while a specific name, such as Googlebot for Google, targets a single bot.</li>
<li><strong>Disallow:</strong> defines which folders or files will be excluded. An empty value means nothing will be excluded, <strong>/</strong> means everything will be excluded, and <strong>/folder name/</strong> or <strong>/filename</strong> can be used to specify what to exclude. Values are matched as prefixes of the URL path. A folder name between slashes like /folder name/ means all content inside the folder name folder will be excluded. Using 1 slash like /folder name excludes every path that starts with it, so it covers the folder as well as files such as /folder name.html.</li>
</ul>
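<p>The prefix matching described above can be tried out with a small Python sketch. It uses the standard library&#8217;s urllib.robotparser module, which is only one possible parser and may differ in details from what a given search engine does. The is_allowed helper, the ExampleBot name and the /private paths are made up for illustration.</p>

```python
from urllib.robotparser import RobotFileParser


def is_allowed(disallow_value, path, agent="ExampleBot"):
    """Parse a one-rule robots.txt and report whether `path` may be crawled.

    `disallow_value` is whatever follows "Disallow:" (hypothetical helper).
    """
    parser = RobotFileParser()
    parser.parse(["User-agent: *", "Disallow: " + disallow_value])
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Compare an empty value, "/", a folder with a trailing slash,
    # and the same name without the trailing slash.
    for value in ["", "/", "/private/", "/private"]:
        for path in ["/private/page.html", "/private.html", "/index.html"]:
            print(repr(value), path, is_allowed(value, path))
```

<p>With this parser, /private/ blocks only paths inside the folder, while /private also blocks /private.html because the rule is compared against the start of the path.</p>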
<p>There are also some other parameters which are supported only by some bots. These are:</p>
<ul>
<li><strong>Allow:</strong> this parameter works as the opposite of <strong>Disallow</strong>. You can list here which content is allowed to be crawled, for example a single file inside a disallowed folder. <strong>*</strong> is a wildcard.</li>
<li><strong>Request-rate:</strong> defines the pages/seconds ratio for crawling. <strong>1/20</strong> would be 1 page every 20 seconds.</li>
<li><strong>Crawl-delay:</strong> defines how many seconds to wait after each successful crawl.</li>
<li><strong>Visit-time:</strong> you can define between which hours you want your pages to be crawled. Example usage is: <strong>0100-0330</strong> which means that pages will be crawled between 01:00 AM and 03:30 AM GMT.</li>
<li><strong>Sitemap:</strong> this parameter shows where your sitemap file is. You must use the complete URL address of the file.</li>
</ul>
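<p>As a rough illustration of how a bot might read these extra parameters, here is a minimal Python sketch based on urllib.robotparser (site_maps() needs Python 3.8 or newer). The rules, the ExampleBot name and the www.example.com address are assumptions for the example. As far as I know this module does not expose Visit-time, so it is left out.</p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content using the extended parameters.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Crawl-delay: 10
Request-rate: 1/20
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Seconds to wait between requests, or None if the file does not say.
delay = parser.crawl_delay("ExampleBot")

# A named tuple with `requests` and `seconds` fields, or None.
rate = parser.request_rate("ExampleBot")

# A list of sitemap URLs, or None if no Sitemap line was found.
sitemaps = parser.site_maps()

if __name__ == "__main__":
    print("crawl delay:", delay)
    print("request rate:", rate.requests, "page(s) per", rate.seconds, "seconds")
    print("sitemaps:", sitemaps)
```

<p>A bot that wants to respect these values would sleep for the crawl delay between requests. Bots that do not support a parameter simply skip it.</p>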
<h3>Robots.txt example:</h3>
<p>User-agent: * #applies to all search engine spiders.<br />
Disallow: /secretcontent/ #disallows them from crawling the secretcontent folder.</p>
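<p>To see how a well-behaved bot would treat the example above, it can be fed to Python&#8217;s urllib.robotparser. This is only a sketch: the www.example.com URLs are placeholders, and real search engine bots use their own parsers.</p>

```python
from urllib.robotparser import RobotFileParser

# The example file from above; text after "#" is a comment and is ignored.
EXAMPLE = """\
User-agent: * #applies to all search engine spiders.
Disallow: /secretcontent/ #disallows them from crawling the secretcontent folder.
"""

parser = RobotFileParser()
parser.parse(EXAMPLE.splitlines())

# Placeholder URLs on a hypothetical site.
secret_url = "http://www.example.com/secretcontent/page.html"
public_url = "http://www.example.com/index.html"

secret_ok = parser.can_fetch("Googlebot", secret_url)
public_ok = parser.can_fetch("Googlebot", public_url)

if __name__ == "__main__":
    print(secret_url, "allowed:", secret_ok)
    print(public_url, "allowed:", public_ok)
```

<p>The check happens inside the bot itself, which is why a bot that never performs it can still crawl the folder.</p>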
<p><strong>Resources:</strong><br />
<a target="_blank" href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;answer=40360">http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;answer=40360</a><br />
<a target="_blank" href="http://www.robotstxt.org/">http://www.robotstxt.org/</a><br />
<a target="_blank" href="http://www.searchtools.com/robots/robots-txt.html">http://www.searchtools.com/robots/robots-txt.html</a><br />
<a target="_blank" href="http://en.wikipedia.org/wiki/Robots.txt">http://en.wikipedia.org/wiki/Robots.txt</a></p>


]]></content:encoded>
			<wfw:commentRss>http://www.webresourcesdepot.com/how-to-use-robotstxt-file/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
