<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Free Web Resources - Web Resources Depot &#187; Search Engine</title>
	<atom:link href="http://www.webresourcesdepot.com/tag/search-engine/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.webresourcesdepot.com</link>
	<description>Free Web Resources</description>
	<lastBuildDate>Sun, 12 Feb 2012 13:06:12 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
		<item>
		<title>How To Use Robots.Txt File?</title>
		<link>http://www.webresourcesdepot.com/how-to-use-robotstxt-file/</link>
		<comments>http://www.webresourcesdepot.com/how-to-use-robotstxt-file/#comments</comments>
		<pubDate>Thu, 24 Jan 2008 17:06:42 +0000</pubDate>
		<dc:creator>Umut M.</dc:creator>
		<category><![CDATA[Extras]]></category>
		<category><![CDATA[No License]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Robots.txt]]></category>
		<category><![CDATA[Search Engine]]></category>
		<category><![CDATA[Seo]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://www.webresourcesdepot.com/how-to-use-robotstxt-file/</guid>
		<description><![CDATA[Robots.txt file usage is sometimes ignored. On the other hand, it is an important factor in getting webpages indexed properly and is very easy to set up. I know that robots.txt is not something new. But I&#8217;ve been preparing an SEO sheet for a while and wanted to share this small &#38; useful portion with you. [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Robots.txt file</strong> usage is sometimes ignored. On the other hand, it is an important factor in getting webpages indexed properly and is very easy to set up.</p>
<p>I know that <strong>robots.txt</strong> is not something new. But I&#8217;ve been preparing an SEO sheet for a while and wanted to share this small &amp; useful portion with you.</p>
<h3>What is robots.txt?</h3>
<p><strong>Robots.txt</strong> is a plain text file, placed in the root of a website, that is used to exclude content from the crawling process of search engine spiders / bots. The convention it follows is called the Robots Exclusion Protocol.</p>
<h3>Why use robots.txt?</h3>
<p>In general, we prefer that our webpages are indexed by the search engines. But there may be some content that we don&#8217;t want to be crawled &amp; indexed, such as a personal images folder, a website administration folder, a web developer&#8217;s test folder for a customer, folders with no search value like cgi-bin, and many more. The main idea is that we don&#8217;t want them to be indexed.</p>
<h3>Is a robots.txt file a guaranteed solution?</h3>
<p>No. Standards-based bots like Google&#8217;s, Yahoo&#8217;s or other big search engines&#8217; robots obey your <strong>robots.txt</strong> file, but only because they are programmed to. Compliance is voluntary: any bot can be written to ignore the <strong>robots.txt</strong> file. Result: there is no guarantee, so content that must stay private needs real protection such as a password.</p>
<h3>How to use a robots.txt file?</h3>
<p>A <strong>robots.txt</strong> file has some simple directives which manage the bots. These are:</p>
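<p>To make the voluntary nature of the check concrete, here is a minimal sketch of what a polite bot does before it requests anything. It uses Python&#8217;s standard <strong>urllib.robotparser</strong> module; the bot name <strong>ExampleBot</strong>, the helper <strong>filter_allowed</strong> and the paths are made up for illustration and are not taken from any real crawler.</p>

```python
from urllib.robotparser import RobotFileParser


def filter_allowed(robots_lines, agent, paths):
    """Return only the paths that the given agent may crawl.

    robots_lines is the content of a robots.txt file, one line per item.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return [path for path in paths if parser.can_fetch(agent, path)]


# Hypothetical rules and candidate paths, for illustration only.
rules = ["User-agent: *", "Disallow: /admin/"]
candidates = ["/index.html", "/admin/login.html"]

# A polite bot keeps only the permitted paths.
polite = filter_allowed(rules, "ExampleBot", candidates)
print(polite)
```

<p>Nothing forces a bot to run this check. A crawler that skips it would simply request every candidate path, which is why robots.txt cannot be relied on to hide content.</p>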
<ul>
<li><strong>User-agent:</strong> this parameter defines which bots the parameters that follow apply to. <strong>*</strong> is a wildcard which means all bots; a specific name such as <strong>Googlebot</strong> targets only that bot (here, Google&#8217;s).</li>
<li><strong>Disallow:</strong> defines which folders or files will be excluded. An empty value means nothing will be excluded, <strong>/</strong> means everything will be excluded, and <strong>/folder name/</strong> or <strong>/filename</strong> can be used to specify what to exclude. The value is matched against the beginning of the URL path. A folder name between slashes like /folder name/ excludes all content inside that folder. A value without the trailing slash like /folder name excludes every path that starts with those characters, so it covers the folder and also files such as /folder name.html.</li>
</ul>
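<p>The effect of the trailing slash in <strong>Disallow</strong> can be checked with a short sketch. It relies on the prefix matching implemented by Python&#8217;s standard <strong>urllib.robotparser</strong>; other parsers may differ in details. The helper <strong>is_allowed</strong> and the <strong>/private</strong> paths are assumptions made for this example.</p>

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules, path, agent="*"):
    """Parse robots.txt text and report whether agent may crawl path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical rule sets that differ only in the trailing slash.
with_slash = "User-agent: *\nDisallow: /private/"
without_slash = "User-agent: *\nDisallow: /private"

for path in ["/private/page.html", "/private-notes.html", "/public/"]:
    print(path, is_allowed(with_slash, path), is_allowed(without_slash, path))
```

<p>With the trailing slash only URLs inside the folder are blocked. Without it, <strong>/private-notes.html</strong> is blocked as well, because it starts with the same characters.</p>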
<p>There are also some other parameters which are supported by only some bots, not all of them. These are:</p>
<ul>
<li><strong>Allow:</strong> this parameter works as the opposite of <strong>Disallow</strong>. You can list which content is allowed to be crawled here. <strong>*</strong> is a wildcard.</li>
<li><strong>Request-rate:</strong> defines the crawl rate as a pages/seconds ratio. <strong>1/20</strong> would be 1 page every 20 seconds.</li>
<li><strong>Crawl-delay:</strong> defines how many seconds to wait after each successful crawl.</li>
<li><strong>Visit-time:</strong> you can define between which hours you want your pages to be crawled. Example usage is: <strong>0100-0330</strong> which means that pages will be crawled between 01:00 AM and 03:30 AM GMT.</li>
<li><strong>Sitemap:</strong> this is the parameter where you can show where your sitemap file is. You must use the complete URL address of the file.</li>
</ul>
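<p>As a rough illustration of how a bot might read these optional parameters, the sketch below parses a sample file with Python&#8217;s standard <strong>urllib.robotparser</strong>. The file content and the <strong>www.example.com</strong> address are invented for the example. This parser understands <strong>Crawl-delay</strong>, <strong>Request-rate</strong> and <strong>Sitemap</strong> (the <strong>site_maps()</strong> method needs Python 3.8 or later); it does not interpret <strong>Visit-time</strong>, so that parameter is left out.</p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Request-rate: 1/20
Disallow: /cgi-bin/
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

delay = parser.crawl_delay("*")   # seconds to wait between requests
rate = parser.request_rate("*")   # named tuple with requests and seconds
sitemaps = parser.site_maps()     # list of sitemap URLs, or None

print("Crawl-delay:", delay)
print("Request-rate:", rate.requests, "page(s) per", rate.seconds, "seconds")
print("Sitemaps:", sitemaps)
```

<p>Reading the values is only half of the job: the bot itself still has to pause between requests accordingly, which again depends on how it was programmed.</p>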
<h3>Robots.txt example:</h3>
<p>User-agent: * #applies to all search engine spiders.<br />
Disallow: /secretcontent/ #tells them not to crawl the secretcontent folder.</p>
<p><strong>Resources:</strong><br />
<a target="_blank" href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;answer=40360">http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;answer=40360</a><br />
<a target="_blank" href="http://www.robotstxt.org/">http://www.robotstxt.org/</a><br />
<a target="_blank" href="http://www.searchtools.com/robots/robots-txt.html">http://www.searchtools.com/robots/robots-txt.html</a><br />
<a target="_blank" href="http://en.wikipedia.org/wiki/Robots.txt">http://en.wikipedia.org/wiki/Robots.txt</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.webresourcesdepot.com/how-to-use-robotstxt-file/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

