Weaponized MS Office 97-2003 legacy/binary formats (doc, xls, ppt, ...)

This article describes the Microsoft Office 97-2003 legacy/binary file formats (doc, xls, ppt), related security issues and useful resources.

The original location of this page is http://www.decalage.info/file_formats_security/office.

Last update: 2014-11-19 (created 2010-03-08)

File format description

MS Office binary formats are widely used:

Except for very old MS Office versions, all these formats share the same basic container structure, variously called OLE2, OLECF, structured storage, or compound file/document.
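Because the container is shared, a quick first triage step is to check for the 8-byte signature that opens every OLE2 compound file (D0 CF 11 E0 A1 B1 1A E1). The following is a minimal sketch using only the Python standard library; the function names are illustrative and do not come from any particular tool. A matching signature only shows that the file claims to be an OLE2 container. It does not tell whether it is a Word, Excel or PowerPoint document, nor whether it is well-formed.

```python
# 8-byte header signature found at offset 0 of every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole2(data: bytes) -> bool:
    """Return True if the buffer starts with the OLE2 signature."""
    return data[:8] == OLE2_MAGIC


def file_looks_like_ole2(path: str) -> bool:
    """Read only the first 8 bytes of a file and check the signature."""
    with open(path, "rb") as f:
        return looks_like_ole2(f.read(8))
```

For example, file_looks_like_ole2("sample.doc") (a hypothetical path) would return True for a legacy Word document and False for a plain text file renamed to .doc.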

MS Office also includes other applications, such as MS Access, which use different file formats that are not based on the OLE2 format.

Since MS Office 2007, new file formats based on XML (docx, xlsx, pptx) have been used by default. See the article about MS Office Open XML.
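Since file extensions cannot be trusted, it can help to tell the two families apart by content. Open XML documents are ZIP packages that contain a part named [Content_Types].xml, whereas the legacy formats start with the OLE2 signature. The sketch below, using only the Python standard library, classifies a buffer on that basis. The function name and the returned labels are illustrative assumptions, and the check is deliberately shallow: it does not validate either format.

```python
import io
import zipfile

# Signature at offset 0 of every OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def classify_office_container(data: bytes) -> str:
    """Return 'ole2', 'ooxml' or 'unknown' based on the container only."""
    if data[:8] == OLE2_MAGIC:
        return "ole2"
    # ZIP archives with at least one entry start with a local file header.
    if data[:4] == b"PK\x03\x04":
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                # Every Open Packaging Conventions package has this part.
                if "[Content_Types].xml" in archive.namelist():
                    return "ooxml"
        except zipfile.BadZipFile:
            pass
    return "unknown"
```

Note that an encrypted Open XML document is stored inside an OLE2 container, so a result of 'ole2' does not by itself prove that the content is a 97-2003 binary document.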

Main client applications

The main applications used to open MS Office files are part of the MS Office suite:

Many alternative applications are also able to open MS Office files, such as OpenOffice, StarOffice, GNOME Office and KOffice.

Main security issues

Format specifications and technical information

Specifications for the OLE2 Compound File format:

Specifications for the MS Office legacy formats:

Publications about MS Office formats security issues

Examples of known vulnerabilities and exploits

Analysis Techniques

Useful Analysis Tools

Parsing tools and libraries

Filtering tools and libraries