PDF Spam

The spam filtering setup on our server is pretty good – SpamAssassin with Bayesian filtering and the FuzzyOCR plugin, which I installed to deal with the rise of image-based spam last year. Still, a few email addresses that route to me are very public, and most days one or two spam messages get through the filters.

This morning I noticed something new in my inbox. I almost moved the message across into my “missed spam” folder without giving it a second thought (we train our filters with missed spam to improve the Bayesian analysis), but something caught my eye:

PDF Spam

“That’s odd,” I thought, so I opened the PDF. (Note: unless you know what you’re doing, it’s a really bad idea to open attachments if you don’t know the sender or weren’t expecting something from them – it could be a virus.)

PDF Spam

That’s right, it’s spam, in a PDF file. While SpamAssassin does a great job of analysing text, and even images using FuzzyOCR, it does no analysis of PDF attachments, so this one slipped through the net. (I’ve had seven copies of this so far today.)

What next? Well, if this type of spam continues (and there’s no reason to think it won’t), I expect we’ll see a PDF-scanning plugin for SpamAssassin before too long. Once that gains traction, the spammers will undoubtedly adapt again with some new trick to avoid the filters. Rinse and repeat.
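To give a rough idea of what such a scanner would need to do, here is a minimal Python sketch. It is not a SpamAssassin plugin (those are written in Perl) and it is not how any existing tool works; it simply pulls PDF attachments out of a raw message and converts them to text that a filter could then analyse. It assumes the `pdftotext` command-line utility is installed and on the PATH, and that it runs on a Unix-like system. The function names `find_pdf_attachments`, `pdf_to_text` and `extract_pdf_text` are made up for illustration.

```python
import subprocess
import tempfile
from email import message_from_bytes, policy


def find_pdf_attachments(raw_message: bytes):
    """Return a list of (filename, payload_bytes) for each PDF part in a raw email."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    found = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no payload of their own
        filename = part.get_filename() or ""
        # Spammers may mislabel the MIME type, so check the file extension too.
        is_pdf = (part.get_content_type() == "application/pdf"
                  or filename.lower().endswith(".pdf"))
        if is_pdf:
            payload = part.get_payload(decode=True) or b""
            found.append((filename, payload))
    return found


def pdf_to_text(pdf_bytes: bytes) -> str:
    """Extract text from PDF bytes using the external pdftotext tool (assumed installed)."""
    with tempfile.NamedTemporaryFile(suffix=".pdf") as tmp:
        tmp.write(pdf_bytes)
        tmp.flush()
        # Passing "-" as the output file sends the extracted text to stdout.
        result = subprocess.run(
            ["pdftotext", tmp.name, "-"],
            capture_output=True,
            check=True,
        )
    return result.stdout.decode("utf-8", errors="replace")


def extract_pdf_text(raw_message: bytes, extractor=pdf_to_text):
    """Return the extracted text of every PDF attachment in the message."""
    return [extractor(payload) for _, payload in find_pdf_attachments(raw_message)]
```

The text that comes back could then be handed to whatever analysis already runs on the message body. A PDF that contains only an image of text would yield nothing here, so image-only PDFs would still need an OCR step on top.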

The arms race continues…

4 Responses to “PDF Spam”

  1.

    Download the scam.ndb.gz and phish.ndb.gz files from
    http://sanesecurity.com/clamav and ClamAV will then catch the PDF stock spams.

  2.

    I really hope you’re right.

  3.

    Check out a module I wrote called “PDFassassin”, which is a plugin for SpamAssassin.

    It scans emails for PDF attachments, uses the pdftotext utility to extract their text for spam analysis, and also extracts images and runs OCR on them to catch spam messages embedded in pictures.

    This plugin can really help prevent the wave of PDF spam messages from hitting your mbox.

    Details at:


  4.

    Looks like the bigger anti-spam vendors are up to scratch too: PDF spam – a step ahead of image spam
