This may sound aggressive, but it’s 100% true. We all complain nowadays about the amount of spam landing in our inboxes. My mail server’s traffic is roughly 80% spam to 20% legitimate email. This is a huge problem for any mail provider and, in the end, for the end user. So, I decided to run a small experiment. I wrote a small Perl crawler (I might discuss it in a later post) and set it loose to collect email addresses. The goal was to find out how easily a spammer can collect addresses to send their stupid spam to. I decided to make it ridiculously easy: I wouldn’t look for emails “disguised” in one way or another, only for plain ones. For instance, I wouldn’t look for “foo [at] bar.com”, which is a common disguise nowadays, but for “foo@bar.com”.
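To show what “looking for plain ones” means in practice, here is a minimal sketch in Python (not the Perl I actually used). The pattern and the name `find_plain_emails` are assumptions made up for illustration; the pattern is deliberately simple and will miss some valid addresses.

```python
import re

# Deliberately simple pattern for plain addresses such as foo@bar.com.
# Illustrative assumption only; not the pattern from the original Perl crawler.
PLAIN_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_plain_emails(text):
    """Return the distinct plain-text addresses in text, in order of first appearance."""
    seen = []
    for match in PLAIN_EMAIL.findall(text):
        address = match.lower()  # treat Foo@bar.com and foo@bar.com as the same
        if address not in seen:
            seen.append(address)
    return seen


sample = "Write to foo@bar.com or to foo [at] bar.com for details."
print(find_plain_emails(sample))
```

On the sample sentence only “foo@bar.com” is picked up; the “[at]” form slips through, which is exactly the case I chose to ignore.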

So, I started coding. About half an hour later I had the little bugger ready: about thirty lines of code. That easy. Now all I needed was a place to start searching from. It would operate like a spider: starting from one site, it would search for emails, then find links to other pages and sites, and then visit those and do the same. I decided to go with the most “promising” category, the bloggers, such as me and you people. So the starting page was a blog submission catalog. From there the little spider would find its way to thousands of blogs.
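The spider loop described above can be sketched roughly like this. It is a Python approximation, not my Perl script, and everything in it is hypothetical: the `crawl` function, the `fetch` callable, the `max_pages` limit and the three-page in-memory “web” that stands in for real HTTP requests.

```python
import re
from collections import deque

# Simple illustrative patterns (assumptions, not the original crawler's).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
LINK = re.compile(r'href="([^"]+)"')


def crawl(start, fetch, max_pages=100):
    """Breadth-first crawl from start; fetch(url) returns page text or None."""
    queue = deque([start])
    visited = set()
    emails = set()
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        page = fetch(url)
        if page is None:  # unknown or unreachable page
            continue
        emails.update(EMAIL.findall(page))  # collect addresses on this page
        for link in LINK.findall(page):     # then queue the pages it links to
            if link not in visited:
                queue.append(link)
    return visited, emails


# Hypothetical three-page "web" used instead of real network access.
PAGES = {
    "catalog": '<a href="blog-a">A</a> <a href="blog-b">B</a>',
    "blog-a": 'Contact: foo@bar.com <a href="catalog">back</a>',
    "blog-b": 'No address here. <a href="blog-a">friend</a>',
}

visited, emails = crawl("catalog", PAGES.get)
print(len(visited), sorted(emails))
```

Passing `fetch` in as a parameter keeps the loop itself free of any network code, so the same logic can be tried on a dictionary of pages like the one here.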

I fired it up and kept an eye on it for an hour or so, keeping it out of “trouble”. By that I mean meaningless pages, deep crawls that lead nowhere and other creepy things like that. After that hour the results were a bit discouraging for my little experiment. But that was because the spider had only just indexed the blog submission directory and built a base to work from. So, I let it run and went to bed.

The next day the results were stunning. It had run for well over 10 hours. Are you ready for this? It had visited over 100,000 pages and the little bastard had collected more than 1,000 emails! Can you imagine? That’s roughly one plain-text email for every 100 pages. One that a dumb little robot can pick up! And I mean business, people. The emails were real ones all right. They were names of people on Gmail, service and domain emails, support addresses and so on. Now, I know that all the email providers run very sensitive filters, but don’t forget, spammers are creative and bypass those filters every day! Go check your email and you will see what I mean 😉

I decided to let it run for one more day to see if the pace would keep up, building a solid spamming base this way. It ran for another 24 hours or so. The results? Well, it had visited a total of 370,000+ pages and managed to collect well over 4,000 emails! Yes, impressive, I know. The rate was slightly higher, but I think that’s within the margin of statistical error. Now, imagine training the robot to understand disguised forms like the tricky one I mentioned above. The numbers would be much, much higher, I bet.
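For the curious, the rates work out as follows. This is a small Python sketch using the rounded figures from this post (the real counts were “over” and “more than” these numbers, so treat the results as approximate); the function name is made up for illustration.

```python
def emails_per_100_pages(emails, pages):
    """Return how many addresses were found per 100 pages visited."""
    return 100 * emails / pages


# Rounded figures from the post; the actual counts were somewhat higher.
first = emails_per_100_pages(1_000, 100_000)   # first run
total = emails_per_100_pages(4_000, 370_000)   # both runs combined
second_leg = emails_per_100_pages(4_000 - 1_000, 370_000 - 100_000)  # second run alone

print(f"{first:.2f} {total:.2f} {second_leg:.2f}")
```

With these inputs the first run gives 1.00 emails per 100 pages, the combined total about 1.08, and the second run on its own about 1.11.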

So, I decided to write this post. Do you remember why you shouldn’t forward chain emails? This is a more basic one. Stop adding your email as plain text to your website or blog, or you’re doomed to get spam. A good question would be, “how can I add my email then?”. Well, here are a few pointers:

  • Do not add your email as plain text. Open up Paint, type it there and put it on your website as an image. That should cut the chances of it being picked up by about 99%.
  • If you are a company, stop using addresses like “info@mycompany.com” or “support@mycompany.com”. These are easily guessed by spammers; all they have to do is try out various combinations. Change “info” to “information” and “support” to “cust-support”. You get the idea, I guess.
  • When giving your email on forums and other social websites, make sure you tick the “keep private” option if it’s there.

In general, do not post your email publicly unless you have to. And in that case take special precautions, for instance by putting it in an image.

I hope this post has got you thinking a little, and that next time you start typing your email on your keyboard (and not writing it with pen and paper, that is) you will think twice before hitting submit.

Photo credit by jeffsmallwood