This document explains how to diagnose and fix common problems that result from misunderstanding or misconfiguration when setting up a client/server samhain system. It is divided into several sections that roughly correspond to the stages a client goes through when connecting to a server. Each section starts with a brief explanation intended to give a basic understanding of what is going on.
This document does not discuss how to set up a client/server system (for this, see the manual and/or the HOWTO-client+server).
Client/server connections are always initiated from the client. The port is compiled in (there is a configure option to change the default). The default port is 49777.
The client reports: Connection refused. The server reports nothing.
The server is down, it listens on the wrong port, or there is a network failure.
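To tell these cases apart, it can help to try a plain TCP connection to the server port from the client machine. The following is a minimal sketch, not part of samhain: the function name probe_server and its return strings are made up for illustration, and the port only matches if the default of 49777 was not changed at configure time.

```python
import socket

# Default samhain server port according to the text; it will differ
# if another port was selected at configure time.
DEFAULT_PORT = 49777


def probe_server(host, port=DEFAULT_PORT, timeout=5.0):
    """Try a plain TCP connect and return a short diagnosis string.

    Hypothetical helper: it only tests reachability of the port and
    does not speak the samhain protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"      # something is listening on host:port
    except ConnectionRefusedError:
        return "refused"       # host reachable, nothing listening there
    except socket.timeout:
        return "timeout"       # packets probably dropped on the way
    except OSError as exc:
        return f"error: {exc}"  # e.g. name resolution or routing failure
```

A result of "refused" points to a server that is down or listening on another port, while "timeout" or an error rather suggests a network or firewall problem.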
The client reports: Connection error: Connection reset by peer, and later also Session key negotiation failed. The server reports: msg="Refused connection from ..." subroutine="libwrap".
The server is compiled with libwrap (TCP Wrapper) support, and either the client is listed in /etc/hosts.deny, or you have set yule: ALL in /etc/hosts.deny and forgotten to put the client in /etc/hosts.allow.
To fix: make proper entries in /etc/hosts.allow and/or /etc/hosts.deny. There is no need to restart/reload the server.
The client has a password that is used to authenticate to the server. This password is located within the binary, and is set with the samhain_setpwd helper application, as explained e.g. in the manual or in the Client+Server HOWTO.
The server has a list of clients that are allowed to connect, and the verifiers corresponding to the passwords of these clients.
Upon successful authentication, client and server will negotiate a session key that is used for signing further messages from the client.
If the password is wrong, the client will report Session key negotiation failed. The server will report: Invalid connection attempt: Session key mismatch
To fix: make sure that the password has in fact been set, that you are using the correct executable for the client (the one in which the password is set), and that the entry in the server config file is the one generated for this password (also look out for duplicate entries for this client).
If the client name (as resolved on the server) is wrong, the client will report Session key negotiation failed. The server will report: Invalid connection attempt: Not in client list, and it will tell you in the same error message what name it has inferred for the connecting client (example): client="client.mydomain.com".
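The server reports quoted in this document carry their details as key="value" fields (msg, subroutine, client, path). When going through a server log, a small helper can pull these fields out. This is a sketch under the assumption that all fields of interest follow this quoting style; the helper name report_fields is made up for illustration.

```python
import re

# Matches fields of the form key="value" as seen in the quoted reports.
FIELD = re.compile(r'(\w+)="([^"]*)"')


def report_fields(line):
    """Return the key="value" pairs found in one server report line."""
    return dict(FIELD.findall(line))
```

Applied to a line containing client="client.mydomain.com", the returned dictionary maps "client" to the name the server inferred, which is the name to compare against the entry in the server config file.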
The fix depends on the nature of the problem. In principle, it is sufficient to change the name of the client in the config file entry to the name the server reports. This is not really a solution, however, if e.g. the server thinks the client is 'localhost'.
There are two different ways to determine the client name. Unfortunately, judging from customer feedback as well as from common sense, neither works very well with a messed-up local DNS (including /etc/hosts files) and/or überparanoid or misconfigured firewalls (in case of connections across one).
First method: Determine client name on client, and try to cross-check on server
This does not work for a number of people because
If the client uses the wrong interface on a multi-interface machine, the config file option SetBindAddress=IP address lets you choose the interface the client will use for outgoing connections.
If you want to download the config file from the server, you should instead use the corresponding command line option --bind-address=IP address to select the interface.
If you encounter problems, you may (1) fix your /etc/hosts file(s), (2) fix your local DNS, or (3) switch to the second method.
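Before changing anything, it may help to see what the resolver on the client and on the server actually returns for the client name. The sketch below does a forward lookup followed by a reverse lookup and compares the results. It only approximates the idea of a cross-check and is not samhain's own algorithm; the function name cross_check and the shape of the returned dictionary are assumptions for illustration.

```python
import socket


def cross_check(name, forward=socket.gethostbyname,
                reverse=socket.gethostbyaddr):
    """Resolve name to an address, resolve the address back, compare.

    The lookup functions are parameters so that the check can be
    exercised without a working resolver.
    """
    try:
        addr = forward(name)
    except OSError as exc:
        return {"name": name, "error": f"forward lookup failed: {exc}"}
    try:
        reverse_name, aliases, _addrs = reverse(addr)
    except OSError as exc:
        return {"name": name, "address": addr,
                "error": f"reverse lookup failed: {exc}"}
    # The name is consistent if the reverse lookup returns it either
    # as the primary name or as one of the aliases.
    known = {n.lower() for n in [reverse_name, *aliases]}
    return {"name": name, "address": addr, "reverse": reverse_name,
            "consistent": name.lower() in known}
```

Running cross_check with the client's name on both machines and comparing the results shows whether /etc/hosts or the DNS gives different answers on the two sides, for instance an address of 127.0.0.1 or a reverse name of localhost.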
Error messages related to name resolving/cross-checking can be suppressed by setting a very low severity (lower than the logging threshold), e.g.
in the Misc section of the server configuration, if you prefer running unsafe at any speed instead of fixing the problem (you have been warned). Doing so will allow an attacker to pose as the client.
Second method: Use address of connecting entity as known to the communication layer
This was dropped as the default long ago because it may not always be the address of the client machine. To enable this method, use
in the Misc section of the server configuration file. If the address cannot be resolved, or reverse lookup of the resolved name fails, no error message will be issued, but the numerical address will be used.
The client does not tell the server the path to the requested file. It only states the type of the file, i.e. either a configuration file or a database file. It is entirely the responsibility of the server to locate the correct file and send it.
The server has a data directory, which by default is /var/lib/yule. The config and database files should be placed here.
Configuration files: rc.client.mydomain.tld or simply rc (this can be used as a catchall file).
Database files: file.client.mydomain.tld or simply file (this can be used as a catchall file).
If the server cannot access the configuration (or database) file, either because it does not exist or the server has no read permission, the client will report File download failed. The server will report: File not accessible, and it will tell you in the same report the path where it would have expected the file (example): path="/var/lib/yule/rc.client.mydomain.com"
To fix: put the file in the location given in the server report, and make sure the server has read permission for it.
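The naming scheme above can be checked with a few lines of code. The sketch below lists the candidate paths for a client and returns the first one that exists and is readable. It assumes that the client-specific file takes precedence over the catchall file, which the text implies but does not state, and the function names are made up for illustration. The readability test applies to the user running the script, so it is only meaningful when run as the user the server runs as.

```python
import os

# Default data directory according to the text.
DATA_DIR = "/var/lib/yule"


def candidate_paths(client, kind, data_dir=DATA_DIR):
    """Return client-specific path, then catchall path.

    kind is "rc" for a configuration file or "file" for a database file.
    """
    if kind not in ("rc", "file"):
        raise ValueError("kind must be 'rc' or 'file'")
    return [os.path.join(data_dir, f"{kind}.{client}"),
            os.path.join(data_dir, kind)]


def find_served_file(client, kind, data_dir=DATA_DIR):
    """Return the first candidate that exists and is readable, else None."""
    for path in candidate_paths(client, kind, data_dir):
        # Assumed order: client-specific file first, catchall second.
        if os.path.isfile(path) and os.access(path, os.R_OK):
            return path
    return None
```

A return value of None for the client name shown in the server report corresponds to the File not accessible case.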
The server has a table with client names and their session keys. If another client process accesses the server from the same host, it will negotiate a fresh session key for that host. As a consequence, the session key of the first client process will become invalid.
Also, the server keeps track of the status of a client. If a client process does not announce its termination to the server, the server will not expect a startup message, and will issue a warning for any such message.
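The two mechanisms just described can be pictured with a toy model. This is purely illustrative and says nothing about the server's real data structures or key format: the class ServerModel, its method names, and the use of random hex strings as keys are all assumptions.

```python
import secrets


class ServerModel:
    """Toy model: one session key and one status flag per client name."""

    def __init__(self):
        self._keys = {}        # client name -> current session key
        self._running = set()  # clients that started and did not exit

    def negotiate(self, client):
        """Store a fresh key for the client, replacing any previous one."""
        key = secrets.token_hex(16)
        self._keys[client] = key
        return key

    def accepts(self, client, key):
        """True only for the most recently negotiated key."""
        return self._keys.get(client) == key

    def startup(self, client):
        """Record a startup; return a warning if no exit was announced."""
        warning = None
        if client in self._running:
            warning = "Restart without prior exit"
        self._running.add(client)
        return warning

    def exit(self, client):
        """Record that the client announced its termination."""
        self._running.discard(client)
```

In this model, a second call to negotiate for the same client name makes accepts return False for the first key, and a second startup without an intervening exit yields the warning.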
The client reports: Invalid connection state. The server reports: Invalid connection attempt: Signature mismatch. This is a sign that a client has tried to connect using an invalid session key. Most probably, another instance of the client is/was started on the respective host.
To fix: if you need concurrent access to the server, suspend the first process with SIGUSR2 before starting the second, and use SIGUSR2 again to wake up the first process afterwards. After sending the first signal, give the process a second or two to return to the main event loop and go into suspend mode. Do not just use SIGSTOP/SIGCONT: it is important that the client tells the server that it will go into suspend.
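The suspend/resume sequence can be wrapped in a small script. The following is a sketch under assumptions: the function name, the two-second settle time (taken from "a second or two" above), and the way the second client is started are illustrative, and the process id of the first client has to be supplied by the caller.

```python
import os
import signal
import subprocess
import time


def run_with_first_suspended(first_pid, second_cmd, settle=2.0,
                             kill=os.kill, run=subprocess.run,
                             sleep=time.sleep):
    """Suspend the first client, run a second command, wake the first.

    first_pid is the process id of the running client; second_cmd is
    the argument list for the second client process.
    """
    # SIGUSR2 asks the first client to suspend (it informs the server).
    kill(first_pid, signal.SIGUSR2)
    # Give it time to return to its main loop and enter suspend mode.
    sleep(settle)
    try:
        return run(second_cmd, check=False)
    finally:
        # A second SIGUSR2 wakes the first client up again, even if
        # starting the second command failed.
        kill(first_pid, signal.SIGUSR2)
```

The try/finally is there so that the first client is not left suspended when the second command cannot be started.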
The server reports: Restart without prior exit for a client. This is a sign that a client has re-started without informing the server about a previous termination.
This would happen if the client was killed with SIGKILL, or if it terminated within the routine to send a message to the server (the routine is not re-entrant). You may want to investigate messages logged via another logging facility (e.g. the client's local logfile). Of course it may also be a segfault, which would be reported via syslog.