April 24, 2003
On the origins of spam

An interesting study on where spam comes from has a non-surprising conclusion. Email addresses harvested from crawled webpages and newsgroups are the major source of unsolicited email. Whois records seem to be less of an issue. Most whois providers have abuse blocking in place anyway, and most registries do not publish zone files to just anyone (zone files are the files listing all the domain names 'taken' within a top-level domain). That newsgroups and the web are the main sources is to be expected: the connectedness of the web makes crawling a real possibility. In fact, web archives of newsgroups could very well be the main source of newsgroup data as well.

A surprising brute-force attempt at emailing every conceivable address at some mail server was also seen. This sounds like a particularly stupid way of searching. The space of addresses just 6 characters long is vast (26^6 ~ 2^28 ~ 0.3 billion, counting lowercase letters only), and it would have to be searched at every mail server on the net. This is not like port scanning (i.e. systematically attempting connections on an entire IP range or across all possible services at one address), which is feasible due to the high density of machines on the internet. The space of email addresses that actually resolve is very sparse in comparison. But it does tell you that your email address needs to be safe against dictionary attacks - like your password - so include special characters if possible.
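The arithmetic can be checked with a few lines of Python. This is only a rough sketch: the function name `local_part_space` is made up for illustration, and it assumes addresses built from the 26 lowercase letters only, which understates the real space, since digits, dots and other characters are also allowed.

```python
import math
import string

# Assumption: only the 26 lowercase letters a-z are tried.
ALPHABET = string.ascii_lowercase


def local_part_space(length, alphabet_size=len(ALPHABET)):
    """Count the distinct strings of exactly `length` characters."""
    return alphabet_size ** length


six = local_part_space(6)
print(six)             # 308915776, roughly 0.3 billion
print(math.log2(six))  # a little over 28, so about 2^28

# Adding the ten digits to the alphabet makes the space several times larger.
print(local_part_space(6, alphabet_size=36))
```

Every extra character class multiplies the work for the attacker, which is the same reasoning that applies to passwords.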

Posted by Claus at April 24, 2003 02:26 PM