<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Writing a simple web crawler in Perl</title>
	<atom:link href="http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/</link>
	<description>Just writing what hits my mind!</description>
	<lastBuildDate>Thu, 13 Jan 2011 22:34:15 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
	<item>
		<title>By: Alexandr Ciornii</title>
		<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/comment-page-1/#comment-6983</link>
		<dc:creator>Alexandr Ciornii</dc:creator>
		<pubDate>Sat, 17 Apr 2010 07:37:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.stratos.me/?p=1156#comment-6983</guid>
		<description>peer: If your program dies on an HTTP error, try the Try::Tiny module to catch exceptions.</description>
		<content:encoded><![CDATA[<p>peer: If your program dies on an HTTP error, try the Try::Tiny module to catch exceptions.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: peer</title>
		<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/comment-page-1/#comment-6000</link>
		<dc:creator>peer</dc:creator>
		<pubDate>Mon, 15 Mar 2010 13:10:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.stratos.me/?p=1156#comment-6000</guid>
		<description>Hi,
I&#039;m writing a crawler in Perl, but I get HTTP errors like 404 and this breaks my $mech-&gt;get() loop. Did you find a way to keep HTTP errors from exiting your script?
Regards,
peer</description>
		<content:encoded><![CDATA[<p>Hi,<br />
I&#8217;m writing a crawler in Perl, but I get HTTP errors like 404 and this breaks my $mech-&gt;get() loop. Did you find a way to keep HTTP errors from exiting your script?<br />
Regards,<br />
peer</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: davidak</title>
		<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/comment-page-1/#comment-2441</link>
		<dc:creator>davidak</dc:creator>
		<pubDate>Mon, 07 Dec 2009 15:05:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.stratos.me/?p=1156#comment-2441</guid>
		<description>Maybe you could modify this to use a SQLite database and offer a download of the complete script?</description>
		<content:encoded><![CDATA[<p>Maybe you could modify this to use a SQLite database and offer a download of the complete script?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: desiNerd</title>
		<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/comment-page-1/#comment-2292</link>
		<dc:creator>desiNerd</dc:creator>
		<pubDate>Tue, 28 Jul 2009 09:58:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.stratos.me/?p=1156#comment-2292</guid>
		<description>@stratosg: Thanks a lot... I&#039;ve been looking for a simple sample web spider to get started in Perl/Python... After spending some time, what I found is that Perl is best suited for this kind of job because almost anything is available through CPAN, which you might not get off the shelf with Python. Now I&#039;m clear that it&#039;s Perl that&#039;s best suited for writing a custom web spider. Your code is really nice and simple. Thanks a lot.</description>
		<content:encoded><![CDATA[<p>@stratosg: Thanks a lot&#8230; I&#8217;ve been looking for a simple sample web spider to get started in Perl/Python&#8230; After spending some time, what I found is that Perl is best suited for this kind of job because almost anything is available through CPAN, which you might not get off the shelf with Python. Now I&#8217;m clear that it&#8217;s Perl that&#8217;s best suited for writing a custom web spider. Your code is really nice and simple. Thanks a lot.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: stratosg</title>
		<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/comment-page-1/#comment-2216</link>
		<dc:creator>stratosg</dc:creator>
		<pubDate>Sat, 11 Jul 2009 16:21:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.stratos.me/?p=1156#comment-2216</guid>
		<description>@Abhishek: Thanks for stopping by!</description>
		<content:encoded><![CDATA[<p>@Abhishek: Thanks for stopping by!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Abhishek Sharma</title>
		<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/comment-page-1/#comment-2211</link>
		<dc:creator>Abhishek Sharma</dc:creator>
		<pubDate>Sat, 11 Jul 2009 13:10:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.stratos.me/?p=1156#comment-2211</guid>
		<description>@Dzieci: I agree with you...
@stratosg: Thanks for explaining it so beautifully... :)</description>
		<content:encoded><![CDATA[<p>@Dzieci: I agree with you&#8230;<br />
@stratosg: Thanks for explaining it so beautifully&#8230; <img src='http://www.stratos.me/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gry Dla Dzieci</title>
		<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/comment-page-1/#comment-2071</link>
		<dc:creator>Gry Dla Dzieci</dc:creator>
		<pubDate>Thu, 04 Jun 2009 07:37:33 +0000</pubDate>
		<guid isPermaLink="false">http://www.stratos.me/?p=1156#comment-2071</guid>
		<description>Getting Perl on Windows takes more time than installing Linux and learning to use it. Seriously.</description>
		<content:encoded><![CDATA[<p>Getting Perl on Windows takes more time than installing Linux and learning to use it. Seriously.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nihar</title>
		<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/comment-page-1/#comment-2057</link>
		<dc:creator>Nihar</dc:creator>
		<pubDate>Tue, 02 Jun 2009 16:30:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.stratos.me/?p=1156#comment-2057</guid>
		<description>Great... Now I know why I get a lot of spam.

I took part of the bot code from Donace and improved the bot to drop Entrecards automatically every day :)</description>
		<content:encoded><![CDATA[<p>Great&#8230; Now I know why I get a lot of spam.</p>
<p>I took part of the bot code from Donace and improved the bot to drop Entrecards automatically every day <img src='http://www.stratos.me/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: stratosg</title>
		<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/comment-page-1/#comment-2022</link>
		<dc:creator>stratosg</dc:creator>
		<pubDate>Mon, 25 May 2009 18:39:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.stratos.me/?p=1156#comment-2022</guid>
		<description>@Donace: It can actually work with anything. The only thing we would need to change is the regular expression (the one that has =~). Other than that, we can extract any kind of info...</description>
		<content:encoded><![CDATA[<p>@Donace: It can actually work with anything. The only thing we would need to change is the regular expression (the one that has =~). Other than that, we can extract any kind of info&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Donace</title>
		<link>http://www.stratos.me/2009/05/writing-a-simple-web-crawler-in-perl/comment-page-1/#comment-2021</link>
		<dc:creator>Donace</dc:creator>
		<pubDate>Mon, 25 May 2009 18:20:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.stratos.me/?p=1156#comment-2021</guid>
		<description>Let&#039;s say that instead of email addresses we search the source code for the comment section and then for &#039;rel=follow&#039;. Would that work in theory with some jury-rigging?</description>
		<content:encoded><![CDATA[<p>Let&#8217;s say that instead of email addresses we search the source code for the comment section and then for &#8216;rel=follow&#8217;. Would that work in theory with some jury-rigging?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

