To block spam, companies often relied on keyword detection: they drew up a list of keywords that commonly appeared in spam email, such as ‘viagra’ or ‘bank’. However, this method often blocked genuine email, and adding more keywords simply produced more false positives. Spammers became smarter too, and they sidestepped keyword blocking by replacing keywords such as ‘viagra’ with variants such as ‘v1agra’.
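As a rough illustration of why keyword lists fail in both directions, here is a minimal sketch of such a filter. The blocklist and the function name `is_spam_by_keyword` are hypothetical and chosen only for this example; real products of the time were certainly more elaborate.

```python
import re

# Hypothetical blocklist: the two example keywords mentioned in the text.
BLOCKED_KEYWORDS = {"viagra", "bank"}


def is_spam_by_keyword(body: str) -> bool:
    """Flag a message if any blocklisted keyword appears as a whole word."""
    words = set(re.findall(r"[a-z0-9]+", body.lower()))
    return not BLOCKED_KEYWORDS.isdisjoint(words)


if __name__ == "__main__":
    samples = [
        "Cheap viagra here",                # spam, caught
        "Your bank statement is attached",  # legitimate, but blocked anyway
        "Cheap v1agra here",                # spam, slips through
    ]
    for text in samples:
        print(f"{text!r}: flagged={is_spam_by_keyword(text)}")
```

The second sample is the false-positive problem, and the third is the obfuscation trick: a one-character substitution is enough to miss an exact-match list.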


Spammers then began using images to bypass text-based content filtering: with the message delivered as a picture, there was no text content left for the filter to scan.

In the space of two months, spammers have switched from image spam to using PDF, Excel and ZIP file attachments. By using these attachments to send images instead of embedding them in the body of the email message, spammers have taken the cat-and-mouse game with anti-spam software developers to a new level.

Instead of embedding the image within the email itself, they ‘repackaged’ it within a PDF attachment. This move is clever for a number of reasons:

1. Email users ‘expect’ spam to be an image or text within the body of the email and not an attachment.

2. Since most businesses today transfer documents using the PDF format, email users will have to check each PDF document; otherwise they risk losing important documentation.

The use of PDF spam was short-lived, as anti-spam software vendors quickly came out with updates and filters that analyzed the body of every PDF file. Not to be defeated, spammers took less than a month to come out with a new option: Microsoft Excel files for pump-and-dump scams. This move was clever for reasons similar to those above for PDFs:

1. Email users ‘expect’ spam to be an image or text within the body of the email and not an attachment.

2. Excel is another extremely common file-type in use and users are very familiar with this format. 

3. Since many businesses use Microsoft Excel for spreadsheets, databases and so on, email users will have to check each document; otherwise they risk losing important documentation.
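Because the spam payload now travels in attachments, a filter first has to notice which attachments a message carries. The sketch below uses Python's standard `email` package to list attachments whose MIME type matches the formats mentioned in this article. The `INSPECT_TYPES` set and the function name are assumptions made for illustration; the sketch only identifies candidates and does not analyze their contents.

```python
from email.message import EmailMessage

# Assumed MIME types for the PDF, Excel and ZIP attachments discussed above.
INSPECT_TYPES = {
    "application/pdf",
    "application/vnd.ms-excel",
    "application/zip",
}


def attachments_needing_inspection(msg: EmailMessage) -> list:
    """Return (filename, content type) for attachments that deserve a deeper scan."""
    return [
        (part.get_filename(), part.get_content_type())
        for part in msg.iter_attachments()
        if part.get_content_type() in INSPECT_TYPES
    ]


def build_sample_message() -> EmailMessage:
    """Build a message with an empty body and two placeholder attachments."""
    msg = EmailMessage()
    msg["Subject"] = "Invoice"
    msg.set_content("")  # no body text for a keyword filter to scan
    msg.add_attachment(b"%PDF-1.4 placeholder", maintype="application",
                       subtype="pdf", filename="invoice.pdf")
    msg.add_attachment(b"placeholder", maintype="image",
                       subtype="png", filename="logo.png")
    return msg


if __name__ == "__main__":
    print(attachments_needing_inspection(build_sample_message()))
```

A real filter would then extract text or images from each flagged file, which is the step the vendors added for PDFs.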

Solution – Using keyword detection methods alone will not solve the problem because new spamming techniques have overcome that hurdle. The solution lies in a product that deploys as many anti-spam techniques as possible, including Bayesian filtering and filtering for images/text embedded in different file-type attachments, while keeping false positives to a minimum. Full paper from GFI here.
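To make the Bayesian filtering idea concrete, here is a minimal naive Bayes sketch with add-one smoothing. It is an illustration under simplifying assumptions (a tiny made-up training set, plain word tokens, hypothetical class and method names), not a description of how any particular product implements it.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesSpamFilter:
    """Toy naive Bayes classifier over word counts (hypothetical example)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        """Estimate P(spam | text); needs at least one message of each label."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Start from the class prior, then add one term per word.
            score = math.log(self.message_counts[label] / total_messages)
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((count + 1) / (total_words + len(vocab)))
            log_scores[label] = score
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


def build_demo_filter() -> NaiveBayesSpamFilter:
    """Train on a tiny, made-up data set for demonstration only."""
    f = NaiveBayesSpamFilter()
    f.train("cheap pills buy now", "spam")
    f.train("buy cheap stock now", "spam")
    f.train("bank statement for your account", "ham")
    f.train("meeting notes attached for review", "ham")
    return f


if __name__ == "__main__":
    demo = build_demo_filter()
    for text in ("buy cheap pills", "your bank statement"):
        print(f"{text!r}: P(spam) = {demo.spam_probability(text):.2f}")
```

Unlike a fixed keyword list, the score depends on how often each word has appeared in known spam versus known legitimate mail, so a word like ‘bank’ does not condemn a message on its own.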