How to convert Signsrch/Clamsrch signatures to Yara

This article explains how I converted Signsrch signatures to Yara rules, in order to include them in my tool Balbuzard. Signsrch signatures are useful for malware analysis, to detect standard constants used in many encryption and compression algorithms, and also some anti-debugging code.

I was going to write a parser for Signsrch signatures when I found Clamsrch. This tool includes a Python script to convert Signsrch signatures to ClamAV signatures.

Then I found the Python script clamav_to_yara.py in the Malware Analyst's Cookbook repository.

Thanks to these two scripts, I managed to convert most of the 2000+ signatures from Signsrch to Yara.

About Signsrch and Clamsrch

Signsrch is a great tool written by Luigi Auriemma, to scan files for various kinds of strings and signatures. It includes an impressive database of 2000+ signatures, covering encryption and compression constants among other things.

Clamsrch is a similar tool based on a modified version of the Clam Antivirus engine, in order to obtain better scan performance than Signsrch. It includes a modified version of the Signsrch signature database, and a Python script to convert that database to the ClamAV signature format.

At the time of writing, the available Clamsrch release dates from July 2012 and comes with a database of 2000+ signatures, while Signsrch was updated with more signatures in 2013.

Step 1 - From Signsrch to ClamAV

First, download Clamsrch. I picked the package "package.01.ful.7z".

The package includes the script clamifier.py, which can convert the signature database sigbase.sig to two ClamAV signature files, clamsrch.ndb and clamsrch.ldb.

These two files are already included in the package, so unless you want to generate them yourself with a new signature database, you can jump to step 2 below.

To run clamifier.py, you will need to download pycrc version 0.7.10. Unzip the file in order to get a subfolder named pycrc-0.7.10 within the clamsrch subfolder. This is necessary because the path is hardcoded in clamifier.py.

Then simply run clamifier.py. After a few seconds, the files clamsrch.ndb and clamsrch.ldb should have been generated.
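The folder layout described above can be checked before launching the script. The following is only a minimal sketch: the helper names pycrc_ready and run_clamifier are hypothetical, and it assumes that clamifier.py takes no arguments and is run from its own folder. clamifier.py may also require the Python version it was written for, so the interpreter used here is a placeholder.

```python
import subprocess
import sys
from pathlib import Path

# Folder name that clamifier.py expects, according to the text above.
PYCRC_DIRNAME = "pycrc-0.7.10"


def pycrc_ready(clamsrch_dir):
    """Return True if the pycrc subfolder sits inside the clamsrch folder."""
    return (Path(clamsrch_dir) / PYCRC_DIRNAME).is_dir()


def run_clamifier(clamsrch_dir):
    """Run clamifier.py from its own folder (assumed to need no arguments)."""
    if not pycrc_ready(clamsrch_dir):
        raise FileNotFoundError(
            "expected %s inside %s" % (PYCRC_DIRNAME, clamsrch_dir))
    # sys.executable is a placeholder: replace it with the interpreter
    # that clamifier.py actually needs on your system.
    subprocess.check_call([sys.executable, "clamifier.py"],
                          cwd=str(clamsrch_dir))
```

Calling run_clamifier("clamsrch") fails early with a clear message if pycrc was unzipped in the wrong place, instead of failing inside clamifier.py.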

Step 2 - From ClamAV to Yara

Download the script clamav_to_yara.py from the Malware Analyst's Cookbook repository.

Then run the following command to convert clamsrch.ndb to clamsrch.yara:

>clamav_to_yara.py -f clamsrch.ndb -o clamsrch.yara

###########################################################################
        Malware Analyst's Cookbook - ClamAV to YARA Converter 0.0.1
###########################################################################

[+] Read 2291 lines from clamsrch.ndb

[+] Wrote 2287 rules to clamsrch.yara

The output is a file with 2000+ signatures that can be used with Yara to analyze executable files and libraries, especially for malware analysis.
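To give an idea of what such a conversion involves, here is a simplified sketch of how one NDB line could be turned into a Yara rule. It is not the code of clamav_to_yara.py: the function names are hypothetical, it only handles plain hex bytes, "?" wildcards and bounded {n}, {n-m} and {-m} jumps, it ignores the target type and offset fields, and it skips any signature that uses other ClamAV features such as "*".

```python
import re

# One byte: two hex digits, either of which may be the '?' wildcard.
BYTE = re.compile(r"^[0-9A-Fa-f?]{2}$")
# ClamAV jumps handled by this sketch: {n}, {n-m} and {-m}.
JUMP = re.compile(r"\{(?:(\d+)|(\d*)-(\d+))\}")


def _jump_to_yara(match):
    """Rewrite a ClamAV jump as a Yara jump, padded with spaces."""
    if match.group(1) is not None:
        return " [%s] " % match.group(1)
    return " [%s-%s] " % (match.group(2) or "0", match.group(3))


def ndb_hex_to_yara(hex_sig):
    """Return the body of a Yara hex string, or None if unsupported."""
    spaced = JUMP.sub(_jump_to_yara, hex_sig)
    if "{" in spaced or "}" in spaced:
        return None  # a jump form this sketch does not handle
    parts = []
    for token in spaced.split():
        if token.startswith("["):
            parts.append(token)
            continue
        if len(token) % 2:
            return None
        pairs = [token[i:i + 2] for i in range(0, len(token), 2)]
        if not all(BYTE.match(pair) for pair in pairs):
            return None  # for example '*' or alternatives
        parts.extend(pair.upper() for pair in pairs)
    # A Yara hex string cannot be empty, nor start or end with a jump.
    if not parts or parts[0].startswith("[") or parts[-1].startswith("["):
        return None
    return " ".join(parts)


def ndb_line_to_yara(line):
    """Convert 'Name:TargetType:Offset:HexSignature' to a Yara rule."""
    fields = line.strip().split(":")
    if len(fields) < 4 or not fields[0]:
        return None
    body = ndb_hex_to_yara(fields[3])
    if body is None:
        return None
    # Yara identifiers only allow letters, digits and underscores.
    rule_name = re.sub(r"[^0-9A-Za-z_]", "_", fields[0])
    if rule_name[0].isdigit():
        rule_name = "_" + rule_name
    return ("rule %s\n{\n    strings:\n        $a = { %s }\n"
            "    condition:\n        $a\n}\n" % (rule_name, body))
```

For example, ndb_line_to_yara("Test.Sig-1:0:*:deadbeef"), where the signature name is made up for illustration, returns a rule named Test_Sig_1 whose only string is { DE AD BE EF }. Lines that the sketch cannot translate return None, which is one way a converter can end up writing fewer rules than it read lines.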

Using the Signsrch signatures with Balbuzard

These Yara rules are already included in the plugins folder of Balbuzard. Therefore, if you have yara-python installed, Balbuzard will automatically use them when scanning a file. I would recommend at least Yara 2.1.0, which is significantly faster than Yara 1.x.

For example, you can test it on the command-line version of 7-Zip, which contains several known encryption and compression routines.
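The converted rules can also be tried directly with yara-python, outside of Balbuzard. The sketch below is an illustration and not Balbuzard's own code: the function names are hypothetical, and the default file name clamsrch.yara assumes the rules file produced in step 2 is in the current folder.

```python
def load_rules(rules_path="clamsrch.yara"):
    """Compile the converted rules with yara-python (imported lazily)."""
    import yara  # provided by the yara-python package
    return yara.compile(filepath=rules_path)


def matched_rule_names(rules, target_path):
    """Return the sorted, de-duplicated names of the rules matching a file."""
    return sorted({match.rule for match in rules.match(target_path)})


# Possible usage, with a hypothetical path to the file to scan:
#     rules = load_rules()
#     for name in matched_rule_names(rules, "sample.exe"):
#         print(name)
```

Each printed name corresponds to one converted Signsrch signature found in the scanned file.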

Potential improvements

Of course, the conversion is not perfect, and the following issues would need some more work:

  • The Clamsrch database is not as recent as Signsrch, and their formats are slightly different. It would be better to improve clamifier.py to support the same signature format as Signsrch, in order to benefit from the latest updates.
  • All three signature formats (Signsrch, ClamAV NDB/LDB and Yara) have their own features and limitations. Each conversion loses data when specific features are not supported by the output format. For example clamifier.py produces two different ClamAV files (NDB and LDB), whereas clamav_to_yara.py only supports the NDB format. It would be better to avoid the intermediate conversion to ClamAV, by merging the code of clamifier.py and clamav_to_yara.py into a single tool, designed to use Yara features.
  • Performance: according to very quick tests on this 7-Zip sample, Yara 2.1.0 seems to be a little bit faster than Signsrch 0.2.2 when using the converted signatures. Yara 1.6 is much slower (5x). However, Clamsrch is 2 to 3 times faster than Yara and Signsrch. Maybe the signatures can be optimized further to improve performance with Yara.
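To get a rough idea of how much is lost along the way, the number of signatures in each file can be compared. This is a minimal sketch with hypothetical function names. It relies on the fact that ClamAV NDB and LDB files hold one signature per line, and it assumes that every rule in the generated Yara file starts on a line beginning with "rule " and comes from exactly one NDB line.

```python
def count_signature_lines(path):
    """Count non-empty lines (one ClamAV signature per line)."""
    with open(path, "r", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def count_yara_rules(path):
    """Count lines that open a rule (assumes they start with 'rule ')."""
    with open(path, "r", errors="replace") as handle:
        return sum(1 for line in handle if line.startswith("rule "))


def conversion_summary(ndb_path, ldb_path, yara_path):
    """Compare the ClamAV signature counts with the Yara rule count."""
    ndb = count_signature_lines(ndb_path)
    ldb = count_signature_lines(ldb_path)
    rules = count_yara_rules(yara_path)
    return {
        "ndb": ndb,
        "ldb": ldb,                  # not handled by an NDB-only converter
        "yara": rules,
        "ndb_dropped": ndb - rules,  # NDB lines that produced no rule
    }
```

With the figures shown in step 2 (2291 lines read, 2287 rules written), ndb_dropped would be 4, and the whole content of clamsrch.ldb would come on top of that.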

Please contact me if you find a better solution to convert Signsrch signatures to Yara.