Samhain Labs

file integrity checkers

A comparison of several host/file integrity monitoring programs

By Rainer Wichmann rainer@nullla-samhna.de (last update: Dec 29, 2009)

Caveat: The author of this study is also the author of one of these file integrity checkers (Samhain). This study is therefore biased insofar as its tests are based on user feedback for Samhain and on the author's personal opinion of what basic functionality a file integrity scanner should provide, which is why Samhain passes all these tests.

If you think some of the results presented here are incorrect or outdated, you are welcome to point out corrections.

The lack of a trademark sign does not imply the non-existence of a trademark.

Content

What is the focus of this study?
Explanation of table rows
Table of results (alphabetic order)
Remarks on individual programs
Speed
Logging options
Centralized management: OSSEC and Samhain
Centralized management: Osiris and Samhain

What is the focus of this study?

There are many reviews which focus on features and tell colourful stories of guys sitting in a server room and watching alerts whooshing over the terminal screen. This one is hopefully different.

This study compares eight free (open-source) host/file integrity checkers (file integrity monitoring programs) with respect to the implementation of the core functionality, i.e. questions like:

Can the program check all files that you may want to check?
Can the program handle quirks / oddities / corner cases of the filesystem (that may e.g. result from normal system activity, from an intrusion, or from errors of shell users)?
Does the program warn about an incorrect configuration (which may cause it to check in a way you did not intend)?

The results presented here are based on test runs, and sometimes also on investigation of the source code. All tests were performed on an Ubuntu Linux system (6.06 for most programs, 9.04 for OSSEC, which was tested at the end of 2009). In general, tests were performed only with console logging (stdout/stderr). Client/server systems (Osiris, OSSEC, Samhain) were tested in a client/server setup with client and server on the same machine.

Thus, while some "features" of these programs are mentioned that may be of interest for usability, the focus of the study was on testing the scanner's functionality, not on listing and/or comparing their features.

Explanation of table rows

Version: The version number of the file integrity scanner.
Date: The release date of the file integrity scanner. For PGP signed source code, this is the date of the PGP signature, otherwise the date listed on the web site, or in the source.
PGP signature: Is the distributed source code PGP-signed?
If there is no signature, it may be possible to put a trojan into the source code (this has happened in the past with several high-profile security-related programs)!
Language: The programming language of the file integrity scanner.
Required: Requirements (other than compiler or interpreter).
Log Options: What channels are supported for logging?
DB sign/crypt: Does the scanner support signed or encrypted baseline databases?
Conf sign/crypt: Does the scanner support signed or encrypted configuration files?
Name Expansion: Does the scanner support expansion of file names (shell-style patterns or regular expressions) in the configuration file?
Duplicate Path: Does the scanner check the configuration file for duplicate entries of files/directories (possibly with a different checking policy for the duplicate)? Strict checking of the configuration file can help to avoid user errors.
PATH_MAX: Can the scanner handle a file whose path has the maximum allowed length (4095 on Linux)?
Root Inode: Can the scanner handle the "/" directory inode? This is the file with the shortest possible path, and also the only one with a "/" in its filename, so it may expose programming bugs (and you do want to check that inode).
Non-printable: Can the scanner handle filenames with weird or non-printable characters? And if it can handle them internally, can it report results in a useful way? Checked filenames were:

bash$ ls -l --quoting-style=c /
-rwxr-xr-x 1 root root 0 Feb 11 20:16 "\002\002\002\002"

As "\002" is non-printable, incorrect reporting will result in a report about removal of the root directory ("/"), if this file is removed ...

bash$ ls -l --quoting-style=c /opt
-rwxr-xr-x 1 root root 0 Feb 11 19:51 "this is_a_love_song\b\b\b\b\b\b\b\b\bwrong_filename"

As "\b" is backspace, incorrect reporting will result in a report for the non-existing file "this is_a_wrong_filename"

bash$ ls -l --quoting-style=c /opt
-rwxr-xr-x 1 root root 0 Feb 11 19:51 "this_has\n_a_newline"

If filenames are not properly encoded, the newline may easily corrupt the baseline database.
No User: Can the scanner handle files owned by a non-existing user (UID with no entry in /etc/passwd)?
No Group: Can the scanner handle files owned by a non-existing group (GID with no entry in /etc/group)?
Lock: Can the scanner handle files on which another process has acquired a mandatory (kernel-enforced) lock (yes, Linux has that kind of lock)? It is possible to open() such a file for reading, but the read() itself will block, so the scanner will hang indefinitely unless precautions are taken. On Linux, mandatory locking requires a special mount option, and thus cannot usually be enforced by unprivileged users.
Race: File integrity scanners first lstat() a file to determine whether it is a regular file, then open() it to read it for checksumming. In between these two calls, a user with write access to the directory may replace the file with a named pipe. As a result, the open() call will block and the scanner may hang indefinitely, unless precautions are taken.
/proc: Is the scanner able to scan the /proc directory? On Linux, at least some files in /proc are writeable and can be used to configure the kernel at runtime, so you may want to check these files. However, files in /proc may be listed with zero filesize, even if you can read plenty of data from them. Almost all scanners "optimize" by not checksumming zero-length files, which is incorrect in the Linux /proc filesystem. Additionally, some files may block on an attempt to read from them.
/dev: Does the scanner have problems with the /dev directory? Does it allow checking device files (e.g. for correct permissions)?
New/Del: Can the scanner report on missing (deleted) or newly created files?
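
The precaution asked for in the 'Race' row can be sketched as follows. This is a minimal Python illustration, not code from any of the tested scanners: the function name checksum_regular_file and the choice of SHA-256 are assumptions made for the example. Instead of trusting an earlier lstat(), it opens the path with O_NONBLOCK, so that a named pipe swapped in at the last moment cannot block the open() call, and then uses fstat() on the open descriptor to verify the file type.

```python
import errno
import hashlib
import os
import stat


def checksum_regular_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of path, or None if it is not a regular file.

    Hypothetical helper for illustration only.
    """
    flags = os.O_RDONLY | os.O_NONBLOCK | os.O_NOFOLLOW
    try:
        # O_NONBLOCK: opening a FIFO that has no writer returns at once
        # instead of blocking. O_NOFOLLOW: refuse a symlink as last component.
        fd = os.open(path, flags)
    except OSError as exc:
        # ELOOP is what Linux reports for a symlink opened with O_NOFOLLOW.
        if exc.errno in (errno.ELOOP, errno.ENXIO, errno.ENOENT):
            return None
        raise
    try:
        # fstat() describes the object that was actually opened, so the
        # type check cannot be invalidated by a later rename or replace.
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            return None
        digest = hashlib.sha256()
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:  # empty read means end of file
                break
            digest.update(chunk)
        return digest.hexdigest()
    finally:
        os.close(fd)
```

Because the type check is made on the descriptor rather than on the path, replacing the file between the check and the read no longer matters. The sketch does not address mandatory locks, which need a separate timeout.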

Table of results (alphabetic order)

	Afick	AIDE	FCheck	Integrit	Osiris	OSSEC	Samhain	Tripwire
Version	2.9-1	0.13.1	2.07.59	4.0	4.2.2	2.3	2.2.6	2.4.0.1
Date	Oct 05, 2006	Dec 15, 2006	May 03, 2001	Apr 19, 2006	Sep 14, 2006	Dec 04, 2009	Oct 31, 2006	Dec 01, 2005
PGP signed	NO	YES	NO	NO	YES	YES	YES	NO
Language	Perl	C	Perl	C	C	C	C	C++
Required		libmhash	md5sum (or md5)		OpenSSL 0.9.6j or newer		GnuPG (only if signed config/database used)
Log Options	stdout	stdout, stderr, file, file descriptor	stdout, syslog	stdout	central log server (email+file on server side)	central log server (email+file on server side)	stderr, email, file, pipe, syslog, RDBMS, central log server, prelude, external script, IPC message queue	stdout, file, email, syslog
DB sign/crypt	NO	NO	NO	NO	NO	NO	sign	sign+crypt
Conf sign/crypt	NO	NO	NO	NO	NO	NO	sign	sign+crypt
Name Expansion	shell-style	regex	NO	NO	regex	ignored files only (regex)	shell-style	NO
Duplicate Path	see remarks	NO	NO	Warns	N/A	Warns	Warns	Exits
PATH_MAX	NO	OK	OK	NO	NO	NO	OK	OK
Root Inode	OK	see remarks	NO	OK	OK	NO	OK	OK
Non-printable	NO	NO	NO	NO	OK	NO	OK	OK
No User	OK	OK	OK	OK	OK	OK	OK	OK
No Group	OK	OK	OK	OK	OK	OK	OK	OK
Lock	Hangs	OK	Hangs	Hangs	Hangs	Hangs	Times out	Hangs
Race	Hangs	Hangs	Hangs	Hangs	Hangs	Hangs	OK	Hangs
/proc	NO	NO	NO	NO	NO	OK	OK	NO
/dev	OK	OK	OK	OK	OK	NO	OK	OK
New/Del	OK	OK	OK	OK	OK	OK	OK	OK

Remarks on individual programs

Afick

Configuration syntax is very similar to AIDE.
Has a GUI (Perl/Tk), which hasn't been tested. Only the command-line application (afick.pl) was tested.
Incorrect configuration (duplicate path with different policies) is only detected if "allow_overload := false" is set, which is not the default.
The test for long filenames failed because of an error in the sdbm database store used by afick (apparently, the path was not accepted as key).
Omits checksum if file size is zero, which is incorrect for Linux /proc files.
The baseline database is binary, but Afick has an option to print the contents.
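
The kind of duplicate-path check discussed in this study can be sketched in a few lines. The example below is a hedged illustration only: it assumes a hypothetical configuration format with one "<path> <policy>" pair per line, which is not the syntax of Afick or of any other scanner tested here, and the function name is invented for the example.

```python
import os


def find_conflicting_entries(config_lines):
    """Return (path, first_policy, line_number, second_policy) for each conflict.

    Assumes a hypothetical config format with one "<path> <policy>" pair
    per line; this is not the syntax of any scanner in this study.
    """
    first_seen = {}
    conflicts = []
    for line_number, line in enumerate(config_lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            raise ValueError(f"line {line_number}: expected '<path> <policy>'")
        # Normalize so that "/etc/" and "/etc" count as the same entry.
        path, policy = os.path.normpath(parts[0]), parts[1]
        if path in first_seen and first_seen[path] != policy:
            conflicts.append((path, first_seen[path], line_number, policy))
        first_seen.setdefault(path, policy)
    return conflicts
```

A scanner could print a warning for each returned tuple, or refuse to start if the list is not empty.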

AIDE

Building without libmhash/libgcrypt is possible, but will result in silently skipping checksums if policy 'R' (read-only) is used, although 'R' is supposed to include md5.
When specifying the root directory, apparently '/.* R' does not match '/'; '/$ R' matches, but only if there is no other rule, so it's useless (i.e. can't check the root directory inode).
There is no tool to list the database (however, it is human-readable, not binary).
AIDE is the only scanner in this study that uses mmap() rather than read() to read a file. This is responsible for passing the 'Lock' test (the kernel denies mmapping a mandatorily locked file).
Judging from comments in the source code, AIDE tries to fix the 'Race' problem, but the solution does not work.
Omits checksum if file size is zero, which is incorrect for Linux /proc files.
For deleted / added files, only the path is printed.

FCheck

Not possible to define different policies (e.g. ignore size change for logfiles).
Omits checksum if file size is zero, which is incorrect for Linux /proc files.
Filenames in baseline database are not properly escaped, thus it is not possible to check files with non-printable characters. Some of them may even corrupt the baseline database (e.g. filenames with newlines).
No check on config file syntax is done. Duplicate entries are scanned twice, and mis-typed directives (e.g. 'Directoy =' instead of 'Directory =') are silently ignored.
No tool to dump/read the baseline database, which is barely human-readable.
If a directory is scanned recursively, the top level directory inode itself is never included. Thus it is impossible to check the root inode.
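
One way to avoid the escaping problem described above is to encode every byte that is not plain printable ASCII before a filename is written to a line-oriented database or report. The following is a minimal sketch under that assumption; the "\xNN" notation and the function names are choices made for this example, not the format used by any scanner in this study.

```python
def encode_name(raw: bytes) -> str:
    """Encode a filename so it contains only printable, non-space ASCII."""
    pieces = []
    for byte in raw:
        # Keep printable ASCII except the backslash, which is the escape
        # character itself. Space is escaped too, so that the result can
        # be stored as one whitespace-delimited field.
        if 0x21 <= byte <= 0x7E and byte != 0x5C:
            pieces.append(chr(byte))
        else:
            pieces.append(f"\\x{byte:02x}")
    return "".join(pieces)


def decode_name(text: str) -> bytes:
    """Reverse encode_name()."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\":
            # An escape is always backslash, 'x', and two hex digits.
            out.append(int(text[i + 2:i + 4], 16))
            i += 4
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)
```

With this encoding, a filename containing a newline is stored as a single line, and a name consisting only of "\002" bytes is reported as visible text instead of an apparently empty name. In Python, raw byte names can be obtained by passing a bytes path to os.listdir().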

Integrit

You can have only one root directory in the config file, which makes it complicated to scan (only) some directories scattered over the file system. You need to run one integrit instance per root with different (per-root) configuration files.
Internally, all path names start with a double '/'. This may cause the observed ENAMETOOLONG error on valid long paths (?).
Usage is simple and straightforward. According to the documentation, the lack of features is intentional to simplify usage (which certainly is a valid argument, as long as one does not need advanced features).
Judging from comments in the source code, Integrit tries to fix the 'Race' problem, but the solution does not work.
Comment by Ed L Cashin (integrit developer): While there are integrit users who agree with you, I maintain that running integrit three times using three configuration files is cleaner and easier than running integrit once with one more cluttered configuration file. You can even take advantage of parallel I/O if the roots are on different devices.

Osiris

Development seems to have stopped in early 2007.
Files with a filename length of NAME_MAX are completely ignored (no database or log entries, except for the possibly modified timestamp of the parent directory).
Omits checksum if file size is zero, which is incorrect for Linux /proc files.
For deleted / added files, only the path is logged.
Osiris applies policies not during the scan, but afterwards (when filtering the logs from comparison against baseline). Thus duplicate entries in the config file are irrelevant.
The database is binary (Berkeley DB format), and can be dumped to a human-readable format with the tool printdb in src/tools/, which is not compiled by default (cd src/tools/ && make).

OSSEC

The file integrity monitoring functionality is basic and inflexible, but simple to configure. Only regular files and symlinks are checked, checks are always fully recursive (but one can define ignored files/directories), and it is not possible to define the set of watched file properties.
Installation is interactive, requires root privileges, and is intrusive (creates users, installs init script). Not possible to do a user-level test install without much work.
Apparently the CIS benchmark test (test for robust partition scheme) does not understand mount point entries in /etc/fstab which refer to the partition by UUID rather than device name (this test is part of the rootkit check, not the file integrity monitoring proper).
The manual is short on technical details, and thus there are several undocumented features, e.g.:
- Only regular files and symlinks are checked. Therefore, it is e.g. impossible to check permissions/ownership on character/block devices or directories. E.g. OSSEC cannot detect if the root filesystem device is chgrp'ed to make it read/write for some user.
- Checked are size, permission, uid, gid, md5 and sha1 checksums. File timestamps are not recorded. The timestamp in the baseline DB is the time when the record was entered into the DB.
- Alerting on new files (option alert_new_files) in the configuration file only works if additionally the level of rule 554 is changed and the ossec-analysisd is restarted. The following needs to be added to rules/local_rules.xml:
```
<group name="ossec,">
  <rule id="554" level="7" overwrite="yes">
    <category>ossec</category>
    <decoded_as>syscheck_new_entry</decoded_as>
    <description>File added to the system.</description>
    <group>syscheck,</group>
  </rule>
</group>
```
- It seems that a minimum check frequency of 300 seconds is enforced (though on Linux and Windows there is a 'real-time' option, presumably using inotify on Linux).
XML errors and invalid options in the configuration are detected, but not invalid option values.
By default, OSSEC sleeps for 2 seconds every 15 files, which makes the scan artificially slow. This can be configured in etc/internal_options.conf (option syscheck.sleep). It's still very slow after switching this off (see below).
Files with filename length of NAME_MAX are at first correctly inserted into the baseline DB, but the report is truncated to 825 characters. Fiddling with the file (modify/delete/recreate) corrupts the DB by generating additional records for the truncated name.
Filenames in the baseline DB or in reports are not escaped or encoded, and thus filenames with non-printable characters are reported incorrectly, and can corrupt the baseline database.
For deleted / added files, only the path is logged. The query tool only works for modified files, not for new or missing ones, so for these one has to grep the baseline DB for details.
The database is a simple text file. Each line (record) is: prefix (3 char internal code), size, permission, uid, gid (all numeric), md5, sha1, timestamp of DB entry (NOT file timestamp), file path.
On the server side, ossec-analysisd re-reads the complete baseline database file for every file checked. This doesn't scale well for large datasets, and can cause abysmal performance.
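
The scaling problem noted in the last remark can be illustrated with a small sketch. It is not OSSEC code: the record layout "<checksum> <path>" and both function names are hypothetical, chosen only to contrast a full rescan per lookup with a lookup table that is built once.

```python
def lookup_by_rescan(db_lines, path):
    """Find the checksum for path by scanning every record (linear time)."""
    for line in db_lines:
        checksum, _, recorded_path = line.partition(" ")
        if recorded_path == path:
            return checksum
    return None


def build_index(db_lines):
    """Read the records once and return a dict mapping path to checksum."""
    index = {}
    for line in db_lines:
        checksum, _, recorded_path = line.partition(" ")
        # Keep the first record for a path, matching lookup_by_rescan().
        index.setdefault(recorded_path, checksum)
    return index
```

Checking n files against n records costs on the order of n*n record comparisons with the first function, but only about n dictionary lookups after a single pass with the second.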

Samhain

Suffers a bit from feature bloat, which probably causes a steeper learning curve than for other programs in this study.
Samhain makes a significant effort to deal correctly with all the pitfalls of a filesystem (such as long path names, filenames with non-printable characters, mandatorily locked files, lstat/open races). The speed test (see below) shows that this can be done without significant impact on performance.
In the 'Lock' test, Samhain will time out after 120 seconds and report the failure.
In the interest of full disclosure: version 1.8.3 and earlier had a bug with formatting of long reports that caused Samhain to fail on tests with long paths. This was discovered in 2004 when preparing the first version of this study.

Tripwire

Needs gcc 3.4 or earlier for compiling. Will not compile with gcc 4.
Tripwire provides no details about modified/added/removed files, only path names, unless one uses twprint --report-level 4, which is pretty verbose.
Omits checksum if file size is zero, which is incorrect for Linux /proc files.
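
Several remarks above note that a scanner omits the checksum when the reported file size is zero. A minimal sketch of the alternative is to ignore the size from stat() and simply read until end of file. The function name and the use of SHA-256 are assumptions made for this example.

```python
import hashlib


def checksum_by_reading(path, chunk_size=65536):
    """Return (sha256_hex, bytes_read) for path, ignoring the reported size."""
    digest = hashlib.sha256()
    bytes_read = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            digest.update(chunk)
            bytes_read += len(chunk)
    return digest.hexdigest(), bytes_read
```

On a Linux system, comparing os.stat("/proc/version").st_size with the bytes_read value returned for the same path shows the discrepancy. Since some files in /proc may block on read, a real scanner would presumably combine this with a timeout.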

Speed

The table shows results (averaged over 5 runs each) for initializing the baseline database and running a check. All times are in seconds, and all data are for a 1.2 GB dataset.

File integrity checking is essentially I/O bound, i.e. most of the time is spent waiting for data from the disk, and thus most of the tested programs run at similar speed.

AIDE is relatively slow when checking, but not when initializing, which hints at a suboptimal implementation of lookups in the baseline database.
FCheck needed almost twice as much time as the others, presumably because it uses an external program, e.g. md5sum, for checksumming, which generates a huge overhead.
Osiris does comparison against the baseline on the server, hence there is no difference between initialize and check.
OSSEC has been tested on a different (newer) platform. To make the results comparable, they have been scaled (assuming linear scaling) by comparing against the speed of the integrit file checker for that platform. The 'sleep two seconds every 15 files' default feature has been switched off, but the speed is still much lower than for any other checker (actual numbers on the test platform: integrit 170s/OSSEC 2126s for init, integrit 129s/OSSEC 3734s for check, on a 2.6 GB dataset). The reason seems to be on the receiving side, where ossec-analysisd runs at almost 100 per cent CPU usage. It appears that ossec-analysisd re-reads the whole baseline database for each file added to it.
Samhain benefits from heavy wizardry to reduce disk traffic [more specifically, the use of O_NOATIME in the open() call].

In the original study (2004), the file checkers written in Perl showed abysmal performance for a similarly large dataset. This may have been memory related, since Perl has a substantial memory overhead. With the present test machine (which has 1 GB of RAM), no such problem was observed.

	Afick	AIDE	FCheck	Integrit	Osiris	OSSEC	Samhain	Tripwire
Version	2.9-1	0.13.1	2.07.59	4.0	4.2.2	2.3	2.2.6	2.4.0.1
Initialize	355.5	357.2	654.8	336.0	348.6	4200	282.4	311.4
Check	392.6	435.9	726.9	349.0	348.6	10100	275.1	314.0
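
The scaled OSSEC figures in the table appear to follow from the numbers quoted in the OSSEC remark above, under the stated assumption of linear scaling. The sketch below reproduces the arithmetic. The function name is hypothetical, and using the Integrit column of this table (336.0 and 349.0) as the reference on the original platform is an assumption that matches the table entries after rounding.

```python
def scale_to_reference(tool_time_new, ref_time_new, ref_time_old):
    """Scale a time measured on the new platform to the old platform.

    Assumes linear scaling: the ratio between the tool and the reference
    checker is taken to be the same on both platforms.
    """
    return tool_time_new / ref_time_new * ref_time_old


# Times in seconds: OSSEC and integrit on the newer platform, and the
# Integrit values from the table for the original platform.
init_scaled = scale_to_reference(2126, 170, 336.0)
check_scaled = scale_to_reference(3734, 129, 349.0)
print(round(init_scaled), round(check_scaled))  # prints "4202 10102"
```

Rounded to three significant digits, these values give the 4200 and 10100 listed for OSSEC.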

Logging options

This is an overview of the logging options provided by the different scanners. This information is mostly taken from the documentation, and usually not verified.

Afick: reports are printed to stdout.

AIDE: reports can be printed to stdout, stderr, plaintext file, or to an open file descriptor. Any combination of these can be used, but the verbosity level cannot be set individually.

FCheck: reports are printed to stdout, and optionally logged to syslog (via the logger standard utility).

Integrit: reports are printed to stdout.

Osiris: Osiris clients only send scan results to the central server, which in turn logs reports to plaintext files and can send emails.

OSSEC: similar to Osiris, OSSEC clients send scan results to the central server, which in turn logs reports to plaintext files and can send emails. Multiple email recipients with different filters can be defined.

Samhain: Samhain can log to stdout, plaintext file, and syslog. Also supported are: sending reports by email, sending reports to a central server, inserting reports into an RDBMS (MySQL, PostgreSQL, Oracle, or unixODBC), sending reports to a Prelude IDS system, writing reports to a named pipe, calling a user-defined external application to process reports (e.g. to send an SMS to a mobile phone), and providing reports via an IPC message queue.

For each supported logging facility, the level of logging can be configured individually. Any combination of facilities can be used in parallel.

Tripwire: Tripwire prints reports on stdout, and stores them in binary files. Optionally it can send reports by email. It can also log to syslog (in a very terse way, where only the number of violations is logged).

Centralized management: OSSEC and Samhain

While both OSSEC and Samhain provide built-in support for centralized logging and management, the general impression is that they are targeted at different user communities.

OSSEC appears to strive to provide a turn-key style solution that tries to avoid intimidating the first-time user:

The installation is done via a rather rigid, interactive installation process that requires root privileges. Basically there is just one way to install it, which leaves little room for user errors.
The installation instructions on the website don't mention PGP signature verification (though there is a PGP signature available). Instead they advise verifying the MD5/SHA1 hashes against a file that is hosted on the same server (which is simpler, but completely insecure: if an intruder modifies the source code on the distribution server, they could change the file with the hashes just as well).
The online manual is short, well-structured and doesn't bother the user with technical details. On the flip side this results in several undocumented features.
The file integrity checking part of OSSEC is very basic and hence easily configured.

However, OSSEC does provide a powerful and flexible rule language for log monitoring, and certainly can be a very powerful tool for that purpose. Log monitoring capabilities are simply outside the scope of this review.

Samhain, on the other hand, provides a lot of functionality and flexibility with respect to file integrity monitoring, at the expense of a steeper learning curve. The installation allows choosing which features get compiled in and where in the filesystem Samhain should be installed. The instructions advertise PGP signature verification of the downloaded code. The Samhain manual is very detailed, strives to document the behaviour of the software completely, and certainly takes some time to read.

Both OSSEC and Samhain support file integrity checking as well as log file monitoring / analysis. However, it is probably justified to say that Samhain is a file integrity checker with added log monitoring capability (which is supposed to be very similar to the capabilities of the tenshi log monitoring program), while OSSEC is a log monitoring application with added file integrity checking capability. In other words, the emphasis on these two aspects is clearly different.

Scalability

OSSEC performs analysis on the server, which means that the server can become a performance bottleneck. The test runs performed indicate that the server does a full read of the baseline database for every file record received from the agent (client). There doesn't seem to be any indexing and/or caching in place. As a result, when checking a large dataset (2 GB), the ossec-analysisd process was running at almost 100 per cent CPU usage even with a single client only, for more than half an hour on the test machine (Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz). This may indicate that OSSEC might run into trouble with significant numbers of agents and/or significant datasets to verify.

Samhain does the analysis on the client side, and agents just send back reports on detected policy violations to the server. This minimizes both the network load and the computational burden placed on the server, thus preventing it from easily turning into a bottleneck.

Client/server connections

Both OSSEC and Samhain use an encrypted and authenticated connection with pre-shared keys. Samhain uses TCP (default port 49777), while OSSEC uses UDP (default port 1514). The documentation doesn't state whether OSSEC transfers the whole dataset on each file check, but since analysis is done on the server, this appears likely. Samhain clients download the baseline database at startup, but then only report policy violations back to the server.

Additional features

In addition to file integrity checking, Samhain can optionally check for kernel rootkits, check for hidden processes and open ports, search the filesystem for SUID binaries, check mount points (and their mount options), and watch login/logout events. These optional checks run in separate threads and hence do not block each other.

OSSEC can perform rootkit detection, which means it will search for hidden open ports, hidden processes (both similar to Samhain), and file system anomalies (such as incorrect hard link counts for directories, which Samhain will report for monitored directories as part of the standard file system check). The OSSEC rootkit detection also searches for known rootkit filenames or known patterns in rootkit files. The rootkit detection runs in the same thread as the file integrity check.

OSSEC provides 'active response' (e.g. adding firewall rules or modifying /etc/hosts.deny). Samhain offers this as well, except that it goes by the name 'external logging facilities'. Both work the same way: you can define external commands which are executed upon certain events. Samhain supplies the event to the called program (so the program can be used for either active response or logging), while the OSSEC manual makes no mention of this.

Samhain offers a large choice of different logging facilities (both on the client as well as on the server side) that can optionally be used simultaneously. OSSEC clients report to the central server, which in turn logs reports to files and optionally can send email alerts.

Both Samhain and OSSEC support a central management interface. In the case of OSSEC, this is a set of command-line utilities (CLI) which are part of the OSSEC package. For Samhain, the management interface is a PHP web-based interface that is available as a separate package (beltane).

OSSEC supports MS Windows natively, while Samhain requires a POSIX emulation (e.g. Cygwin).

Centralized management: Osiris and Samhain

Osiris and Samhain both provide built-in support for centralized logging and management.

Both systems are able to collect reports/data from clients on a central server, and to store baseline databases and client configurations on the central server. Configuration changes and updates of the baseline database can be performed centrally rather than on individual hosts monitored by the system.

General design differences: push vs. pull

Centralized logging and management requires a client/server system where at least one side has to listen on the network for connections, and thus is potentially vulnerable to remote attacks.

For Osiris, scan requests are pushed from the central server to the individual scanner clients. Thus the client, which needs root privileges to open and checksum privileged files, also listens on the network.
On Unix/Linux, this problem is mitigated by using privilege separation (similar to OpenSSH): there is a privileged process that only handles actions that require root privileges, and an unprivileged (sub-)process that does most of the work (including network connections).
However, on MS Windows, privilege separation is not supported.

Samhain works the other way: clients pull the baseline database from the server, and return reports. Here it is the server that has an open port, and the server does not need root privileges. In fact, the Samhain server (called 'yule') will only run as an unprivileged user (it drops root privileges if started with them), and can be chrooted.

Both Samhain and Osiris use encrypted client/server connections.
With Osiris, (only) the server must authenticate to the client. However, similar to Samhain, Osiris clients negotiate a shared secret with the server that is kept in memory after startup, thus attempts to replace the client can be detected once it has started.
Samhain uses mutual authentication (where the client's credentials are located within the client executable). Upon successful authentication, a shared secret is negotiated that is kept in memory.

With Osiris, clients send back snapshots of the file system, which are compared to the baseline on the server side, and stored in the same location as the baseline database. Thus the server (which is potentially vulnerable to malicious clients) needs write access to the directory where baseline data is stored.

Samhain clients only send back reports on filesystem modifications. These reports can be used to update the baseline database on the server via the central management console. The server only needs read access to the baseline data.

Additional features

Osiris can report if (and which) users/groups have been added to /etc/passwd and/or /etc/group. It can also report on newly loaded kernel modules (in a limited way, you can do that with Samhain by monitoring the checksum of /proc/modules).

Samhain offers a large choice of different logging facilities (both on the client as well as on the server side) that can optionally be used simultaneously. Osiris clients only report to the central server, which in turn logs reports to files and optionally can send email notifications.

Both Samhain and Osiris support a central management interface. In the case of Osiris, this is a command-line interface (CLI) that is part of the Osiris package. For Samhain, the management interface is a PHP web-based interface that is available as a separate package (beltane).

Osiris supports MS Windows natively, while Samhain requires a POSIX emulation (e.g. Cygwin).

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Germany License.
