Friday, September 08, 2006

Partial file matching in host intrusion prevention systems

A few weeks ago Jesse Kornblum released the SSdeep [1] tool. The main propose of this tool is to identify similar files by calculating hashes and comparing those hash values to the known values computed previously and stored in database.
he big difference between SSdeep and other well-known tools to generate hash values (like md5sum) is that SSdeep calculates hash values for small chunks of target file. So if someone modifies only few bytes of a file, the new calculated hash value will be similar to the previous one [2].


The main features (altered document matching and partial file matching) of the SSdeep are very helpful during forensic analysis. But you can also use partial file matching feature in host intrusion prevention systems. You can use this function to disable execution of specific programs.

It was rather useless to use “normal” hash values in IPS to prevent execution because such solution can be easily cheated. After changing even one byte in executable file the new hash value is completely different. As you can guess there are a lot of places in an executable file which can be modified and the exe file still will be executed without any problems.

Before one byte modification:

C:\ssdeep>md5sum gg.exe
\0323de930ed3e8e0552843db7e16dab7 *C:\\ssdeep\\gg.exe

After one byte modification:

C:\ssdeep>md5sum gg.exe
\bfa5aed4078c2a316786c1e7cb1e4f8e *C:\\ssdeep\\gg.exe

As mentioned above the SSDeep generates has values for small blocks of target file. So few modifications of target file will not change the whole value of generated hash as it is presented below:

Before modifications:

C:\ssdeep>ssdeep -l gg.exe

C:\ssdeep>ssdeep gg.exe > sum.txt
C:\ssdeep>ssdeep -m sum.txt gg.exe
C:\ssdeep\gg.exe matches C:\ssdeep\gg.exe (100)

After few modifications in .rsrc section:

C:\ssdeep>ssdeep -l gg.exe

C:\ssdeep>ssdeep -m sum.txt -p gg.exe
C:\ssdeep\gg.exe matches C:\ssdeep\gg.exe (91)

Additionally, the percentage value of similarity is generated (value in brackets).
By setting the value to 60 or 70 we can implement quite effective method of blocking execution of specific files.

This solution could block the execution of particular program and even new versions of it because very often new releases are based on the previous one.

Useful links: