
Quick URL parsing using Perl

Have you ever stumbled on a page full of links you wanted to copy, only to find there were far too many to grab one by one? A directory listing where you want to download every file, for instance? Well, I faced that problem the other day. I wanted to copy more than 30 consecutive links from a directory listing, and doing it by hand seemed plain stupid. The first thing that popped into my mind was a Firefox plugin. I looked here and there, and every one I found had something I didn’t like. Then it came to me: I would copy the page source and use a little Perl script to extract the links. Sounds hard? It’s not, since Perl is the right tool for this job. So, after a little tinkering, this is what I came up with.

    #!/usr/local/bin/perl
    use strict;
    use warnings;

    # Subclass HTML::Parser and print the href of every <a> tag.
    package MyParser;
    use base qw(HTML::Parser);

    # Prepended to every link; set it to the site root when the links are relative.
    my $prefix = "http://a_site_do_use.com";

    # Called by HTML::Parser for each opening tag.
    sub start {
        my ($self, $tagname, $attr, $attrseq, $origtext) = @_;
        if ($tagname eq 'a' && defined $attr->{href}) {
            print $prefix . $attr->{href} . "\n";
        }
    }

    package main;

    # Read the saved page source in one go.
    my $file = "urls.txt";
    open(my $urls, '<', $file) or die "Cannot open $file: $!";
    my $html = do { local $/; <$urls> };
    close($urls);

    my $parser = MyParser->new;
    $parser->parse($html);
    $parser->eof;

A quick explanation: we use the HTML::Parser module (available from CPAN if your Perl doesn’t already have it). It’s a pretty nifty tool. If you want to use the script as is, you need to check two things. First, the source of the page you are extracting links from should be saved in a file named “urls.txt”. Second, if the URLs are relative (just like the ones Apache produces in a directory listing), you need to put the full prefix in the “$prefix” variable. If you want to tweak it, be my guest. It’s roughly written anyway. If you don’t feel comfortable with code, then go for those plugins. They are pretty good. Just not for me.
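For comparison, here is a minimal sketch of the same idea in Python, using the standard library's html.parser module. It is not part of the original script: the names LinkCollector and extract_links, and the example.com prefix, are made up for illustration. Like the Perl version, it simply glues the prefix in front of every href it finds.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, with an optional prefix (hypothetical helper)."""

    def __init__(self, prefix=""):
        super().__init__()
        self.prefix = prefix
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(self.prefix + href)


def extract_links(html, prefix=""):
    """Return all link targets found in the given page source."""
    collector = LinkCollector(prefix)
    collector.feed(html)
    collector.close()
    return collector.links


# Made-up page source standing in for the contents of urls.txt.
sample = '<a href="/files/a.zip">a</a> <p>text</p> <a href="/files/b.zip">b</a>'
links = extract_links(sample, "http://example.com")
for link in links:
    print(link)
```

To run it on a saved page, read the file into a string and pass that to extract_links instead of the sample.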

So, I hope this helps you guys as much as it helped me!

PS: I know it can be written more efficiently, but it works, so I’m done tweaking :)

Small hiccup on the “Featured” post

A few days back I posted on how to reserve a small area above the posts in your theme for a featured post (just like the one running here, as you can see at the top of the posts). While validating my blog in different browsers today, I stumbled on a serious mistake I made. That tutorial contains this line of code:

    <?=substr($featured->post_content, 0, 500);?> []

It has a serious bug that lurks quietly and is very hard to spot. The line shortens the post to its first 500 characters so it appears as an excerpt. But here is the problem: when you use a “more” tag to cut your posts, the following text is inserted into the post:

    <!--more-->

The trouble starts when the cut lands just after the:

    <!--

And there is the problem. That character sequence is the HTML markup that opens a comment, so the browser treats everything that follows as a comment and the rest of your page does not render. That is a major problem, as you can imagine. I have a fix for it though. Please replace the line of code above with this one:

    <?=substr(str_replace('<!--more-->', '', $featured->post_content), 0, 500);?> []

This code makes an extra call to the str_replace function, replacing the more tag with the empty string before the text is cut. That way, the problem cannot occur. I hope it didn’t happen to you guys, because it’s scary! If you encounter any more problems, please report them back to me!
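To see the bug and the fix side by side without touching a live theme, here is a minimal sketch in Python. It is only an illustration: the excerpt function and the made-up post text are assumptions, and the string slicing stands in for PHP's substr.

```python
MORE_TAG = "<!--more-->"


def excerpt(post_content, limit=500):
    """Drop the more tag first, then keep the first `limit` characters."""
    return post_content.replace(MORE_TAG, "")[:limit]


# A made-up post whose more tag straddles the 500-character cut.
post = "x" * 496 + MORE_TAG + " rest of the post"

naive = post[:500]     # the original approach: cut first
fixed = excerpt(post)  # the fix: remove the tag, then cut

print(naive[-4:])       # the dangling comment opener
print("<!--" in fixed)  # whether the fixed excerpt still contains one
```

The naive cut ends in an unclosed comment opener, while the fixed version cannot contain one that came from the more tag.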

Get rid of the Sociable plugins. Do it yourself!

While I am on the topic of WordPress tweaks, here is another one. Since I started coding plugins, I keep encouraging people to minimize the number of plugins they use. Plugins are resource hungry and may expose your blog to security issues. Some people agree but point out that not everyone is comfortable with code and able to make changes on their own, so they need plugins for the job even when the task could be coded easily. Since I think they have a point, I decided to write this small article on how to get rid of the sociable plugins. Those are the ones that add a small block at the end of each article with links for digging, stumbling or subscribing. I think these are the easiest ones to code into your template. That way you can uninstall them and make your blog a bit lighter. Here is how.

You will need to edit two files in your theme, single.php and style.css. The first is the script that tells WordPress how to render an article, and the second tells the browser how to color, place and show each element on your site. Go to “Design -> Theme Editor -> Single Post (single.php)”. Open that file and locate the line that says:

    <?php comments_template(); ?>

This is where your theme inserts the comments and the comment form below your post. Just above it is where you want to add the sociable block. Here is mine:

    <div class="liked_article">Did you find this article useful? Did you like it? Why not <a href="http://digg.com/submit?phase=2&amp;url=<?= the_permalink();?>&amp;title=<?= the_title();?>">Digg</a> it or <a href="http://www.stumbleupon.com/url/<?= the_permalink();?>">Stumble</a> it. Maybe you found it to be <a href="http://del.icio.us/post?url=<?= the_permalink();?>&amp;title=<?= the_title();?>">Delicious</a>? Even better, why don't you <a href="<?= bloginfo('rss2_url');?>">subscribe</a> to my feed so you can get the latest from this blog. If you prefer you can subscribe to this <?= comments_rss_link("article's comments");?> feed.</div>

As you can see, it’s the text that shows up at the bottom of each article, inside the dashed box. If you like, you can change it to whatever you want. Now it’s time to edit style.css. Go to the bottom of the file and add this block:

    div.liked_article {
        background: #FFFFDD;
        border: 1px dashed black;
        margin: 10px 0;
        padding: 10px 5px;
    }

This tells the browser to draw a dashed box around the block, leave 10 pixels of space between the box and the content above and below it (the margin), and keep the text 10 pixels away from the top and bottom dashes and 5 pixels away from the left and right ones (the padding). That is all! Save the files and visit one of your articles. Be sure to clear the cache, if you have one enabled, so the changes take effect immediately.

You still can’t do it? Well, if you are willing to add a small link back to my site in your blogroll or ad area, then mail me these two files (at the email found on the about us page) and I will do it for you! If you want more changes and need some assistance with your blog, then consider hiring me. If you run into any problems, do not hesitate to come back with comments!

Delete post revisions without any plugin

You know I’m all for “If you can do it yourself, then don’t use a plugin”. That goes, needless to say, for my plugins too. So, I recently saw a plugin doing a very useful thing: cleaning up post revisions. But what is a post revision? Well, it’s a safety measure the WordPress team added in version 2.6.X to protect your writing from unfortunate circumstances. It’s also a way to keep different versions of your posts. When you write, you tend to delete and start over. At some point you might want to go back and see what your post looked like earlier. That is pretty much what a post revision is.

But when you are done with a post, you most probably will never use those revisions again. So think about having ten versions of a certain post or, even worse, ten versions of many posts. That’s a lot of unnecessary info and clutter in your database. To give you the whole picture, post revisions live in the same table as your actual published posts. This is bad in two ways. One, it makes the table harder to index and query. Even worse, when a query joins this table with another one, the result of the join carries a lot of junk rows and therefore a lot of unneeded info. Two, imagine a post replicated ten times in your database. It could be 60 KB or even much more. That is a lot of wasted space. For this reason it’s a very good idea to remove post revisions.

The plugin can do the job for you but, as I already said, this is a trivial job and you can easily do it by hand. All you need to do is run a simple SQL query on your database. Before going any further, please make sure you have a backup of your database for the scary moment when something goes wrong. Open your SQL manager (the console, phpMyAdmin or whatever you use to access your database) and run this query:

    DELETE FROM wp_posts WHERE post_type = 'revision';

You are done! Please make sure you type exactly this and nothing less; without the WHERE clause the query would delete your posts too. Now, to improve on this, you might want to keep the recent revisions and delete only the ones older than a certain date. This is what you are looking for:

    DELETE FROM wp_posts WHERE post_type = 'revision' AND post_date < '2008-11-18 00:00:00';

This deletes all revisions dated before 2008-11-18, which was a week back when I wrote this; change the date to suit your needs. So, go ahead and get rid of all the clutter. When I did so, I reduced my database’s size by 1.2 MB! I know, impressive huh? Keep one thing in mind though: backups, backups and more backups. The more the better ;)
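If you would rather try the dated delete on a throwaway table before running it for real, here is a minimal sketch in Python. It rests on a few assumptions: it uses an in-memory SQLite database as a stand-in for your MySQL one, the table has only the columns the query touches, and the function name delete_old_revisions and the sample rows are made up.

```python
import sqlite3


def delete_old_revisions(conn, cutoff):
    """Delete revision rows dated before `cutoff`; return how many were removed."""
    cur = conn.execute(
        "DELETE FROM wp_posts WHERE post_type = 'revision' AND post_date < ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


# Throwaway stand-in for the real wp_posts table (only the columns we need).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_type TEXT, post_date TEXT)"
)
conn.executemany(
    "INSERT INTO wp_posts (post_type, post_date) VALUES (?, ?)",
    [
        ("post", "2008-11-01 09:00:00"),      # published post: must survive
        ("revision", "2008-11-01 08:30:00"),  # old revision: should go
        ("revision", "2008-11-20 10:00:00"),  # recent revision: should stay
    ],
)

removed = delete_old_revisions(conn, "2008-11-18 00:00:00")
remaining = conn.execute(
    "SELECT post_type, post_date FROM wp_posts ORDER BY ID"
).fetchall()
print(removed, remaining)
```

Only the old revision is removed; the published post and the recent revision stay put. Passing the date as a parameter instead of typing it into the query also saves you from quoting mistakes.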

Brand your Windows installation

When buying a new computer, you must have noticed that the “My Computer” properties show a logo and the vendor’s technical support information. On the left side of the system info, on the tab labeled “General”, there is the company’s logo. At the bottom there is a button called “Support information”. Clicking on it opens a new window with the vendor’s support details, maybe a few phone numbers and addresses. This kind of information is called “OEM info” (Original Equipment Manufacturer). Have you ever wondered where that info is stored? Have you ever wanted to do a similar thing on a PC you formatted or assembled? Here is how…

(more…)
