Exploring processes

By Rainer Wichmann rainer@nullla-samhna.de    (last update: Dec 30, 2009)

Introduction

Sometimes something just doesn't work, and you have no clue why. Sometimes it does work, but you don't know what it is - in the worst case, it may be a process started by a malicious intruder. These are the times when you want to explore a process and find out what it does, and/or why it doesn't work.

The /proc filesystem

For each process with pid PID, there is a directory /proc/PID which exposes abundant information about the process, though not all of it in a human-readable format. Among this information, there is:

  • /proc/PID/exe This is a link to the absolute path of the executed program. E.g.:
    root# ls -l /proc/1/exe
    lrwxrwxrwx 1 root root 0 2009-12-30 21:03 /proc/1/exe -> /sbin/init
    	
  • /proc/PID/environ The environment of the process, as null-terminated C strings. To replace the terminating null characters with newlines, you can use "tr '\0' '\n'":
    root# cat /proc/1/environ | tr '\0' '\n'
    ROOTFSTYPE= 
    HOME=/
    DPKG_ARCH=i386
    init=/sbin/init
    ROOTFLAGS=
    ...
    	
  • /proc/PID/cwd This is a link to the current working directory of the process. This might be helpful to find files related to the process.
  • /proc/PID/cmdline This is a file containing the command line used to start the process (add the 'echo' for a newline after the end of the command line string):
    root# cat /proc/1/cmdline && echo
    /sbin/init
    	
  • /proc/PID/fd This is a directory with a link for each open file the process has (including pipes and sockets).
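
The entries above can be collected in one go with a small shell function. This is only a sketch: the name proc_summary is made up for this example, and for processes owned by other users you will usually need to be root to read most of these entries.

```shell
# proc_summary PID: print the /proc entries discussed above for one process.
# The function name is hypothetical, chosen for this example only.
proc_summary() {
    dir=/proc/$1
    if [ ! -d "$dir" ]; then
        echo "no such process: $1" >&2
        return 1
    fi
    # exe and cwd are symbolic links; readlink prints their targets.
    echo "exe:     $(readlink "$dir/exe" 2>/dev/null)"
    echo "cwd:     $(readlink "$dir/cwd" 2>/dev/null)"
    # cmdline separates the arguments with null bytes; show them as spaces.
    echo "cmdline: $(tr '\0' ' ' < "$dir/cmdline")"
    # Each entry in fd/ is a link named after the descriptor number.
    echo "open files:"
    for fd in "$dir"/fd/*; do
        [ -L "$fd" ] || continue
        echo "  ${fd##*/} -> $(readlink "$fd")"
    done
}
```

For example, "proc_summary 1" would print the same information as the individual commands shown above for the init process.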

strings

The strings program is a simple utility that searches a binary file for printable strings and lists them. Thus, if you have found the actual executable for a process PID via /proc/PID/exe, you could run strings on it to see whether this yields some strings that give further information on the program.
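
Since the output of strings on a real executable is long, it may help to filter it. The following is a minimal sketch; the function name and the search pattern (URLs, version banners, license notices) are assumptions made for this example, not a fixed recipe.

```shell
# interesting_strings FILE: list printable strings of at least 8 characters
# in FILE and keep only those that look like URLs, version banners or
# license notices. Name and pattern are illustrative assumptions.
interesting_strings() {
    # -a scans the whole file, -n 8 sets the minimum string length.
    strings -a -n 8 "$1" | grep -i -E 'https?://|version|license'
}
```

You can pass the /proc link directly, e.g. "interesting_strings /proc/PID/exe", which saves you from looking up the path first.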

ldd

Never use ldd on an untrusted executable. The ldd utility determines shared library dependencies by executing the program in a special way. It is possible to exploit this with a crafted executable (this issue is very old, but undocumented in the Linux manpage). Compare the following two commands, which do exactly the same thing:

$ ldd /bin/true
    linux-gate.so.1 =>  (0xb801e000)
    libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7ea2000)
    /lib/ld-linux.so.2 (0xb801f000)
$ LD_TRACE_LOADED_OBJECTS=1  /bin/true
    linux-gate.so.1 =>  (0xb8006000)
    libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7e8a000)
    /lib/ld-linux.so.2 (0xb8007000)
    

If you're interested in the library dependencies of an executable, you can instead use the following command, which only reads the file and does not execute it:

objdump -p /program | grep NEEDED
    

strace

The strace tool monitors the system calls that an application performs, in other words the requests (for file access, network access, ...) that an application makes to the kernel (and the responses it gets from the kernel).

This is usually quite helpful for troubleshooting because some of the major reasons why things don't work are:

  • Some file can't be accessed, for whatever reason (e.g. it may not exist, permissions may be incorrect, or it may be on a remote filesystem that is currently unavailable).
  • The application tries to contact a remote machine and fails (network down, remote machine down,...).

You can attach strace to a running process with "strace -p PID", or you can start a process under strace with "strace command". Since the output can be quite voluminous, you may want to redirect it to a logfile with the strace option -o logfile.

If you want to focus on file access only, you can use the option -e trace=file. Likewise, to investigate network access, use -e trace=network.
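
The options mentioned above can be combined. The two wrappers below are a sketch of typical invocations; the function names are invented for this example, and the -f option (follow child processes) is an addition not discussed above.

```shell
# trace_cmd LOGFILE COMMAND [ARGS...]: run COMMAND under strace and record
# only its file-related system calls in LOGFILE.
# trace_cmd and trace_pid are hypothetical names for this sketch.
trace_cmd() {
    log=$1
    shift
    # -f follows child processes, -o writes the trace to a file,
    # -e trace=file limits the trace to calls that take a file name.
    strace -f -o "$log" -e trace=file "$@"
}

# trace_pid LOGFILE PID: attach to a running process and record its
# network-related system calls until strace is interrupted with Ctrl-C.
trace_pid() {
    strace -f -o "$1" -e trace=network -p "$2"
}
```

For instance, "trace_cmd /tmp/trace.log ls /etc" leaves the trace of ls in /tmp/trace.log while the normal output of ls still goes to the terminal.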

File access

Usually, for file access you will want to look for system calls like access (do I have access to this file?), lstat and stat (what are the file properties?) and open (open that file for reading or writing!). This may look like:

lstat64("/root/foobar", {st_mode=S_IFREG|0700, st_size=52224, ...}) = 0
open("/root/foobar", O_RDONLY|O_LARGEFILE) = -1 EACCES (Permission denied)
    

Here the program failed to open() the file "/root/foobar" for reading (O_RDONLY) because the required permission is denied. However, the parent directory ("/root") obviously has search permission set, otherwise the lstat() would have failed with EACCES. If a file is missing, the error would be ENOENT. If a remote filesystem is unavailable, the application might hang in an lstat() or open().

Note that it is quite typical that applications have many lstat() and/or open() calls failing with ENOENT (No such file or directory) because many files (libraries under /lib, /usr/lib, language translation files under /usr/lib/locale, /usr/share/locale) might be searched in multiple places. An example would be the following snippet, showing that the file is first searched unsuccessfully in a directory "en_US.UTF-8", followed by a successful attempt in "en_US.utf8":

open("/usr/lib/locale/en_US.UTF-8/LC_MESSAGES", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/en_US.utf8/LC_MESSAGES", O_RDONLY) = 3
    

Successful calls to lstat and stat have "0" as result, while successful calls to open have a numeric file descriptor as result. This file descriptor may later be used in read and/or write calls.
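
When reading a logfile written with -o, it can save time to look at the failed calls first and to hide the usually harmless ENOENT results. A minimal sketch, with a made-up function name:

```shell
# failed_calls LOGFILE: show the system calls in an strace log that returned
# an error, leaving out ENOENT results from multi-directory searches.
# The function name is hypothetical.
failed_calls() {
    grep -E ' = -1 E[A-Z]+ ' "$1" | grep -v ' = -1 ENOENT '
}
```

Applied to the examples above, this would show only the open() call that failed with EACCES. If it shows nothing suspicious, the ENOENT lines are the next place to look, since a genuinely missing file also ends up there.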

Network access

For network access, you might be interested in system calls like bind (bind to a port - if this fails for a server, it may indicate that the port is already in use by another program) and connect (connect to a local socket, indicated by AF_FILE, or to a remote machine, indicated by AF_INET). E.g. if you connect to an ssh server, you may see the following:

connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("X.X.X.X")}, 28) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(22), sin_addr=inet_addr("Y.Y.Y.Y")}, 16) = 0
    

Here, the first connect goes to a nameserver "X.X.X.X" (port 53) to query the address of the machine you want to ssh to. The second connect then starts the actual ssh connection (port 22) with the server "Y.Y.Y.Y".
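
To get a quick overview of the machines a process has contacted, you can pull the addresses and ports out of a logfile. The sed expression below is a sketch that assumes the connect() lines look like the ones shown above; the function name is made up.

```shell
# connect_targets LOGFILE: print "ADDRESS PORT" for every AF_INET connect()
# found in an strace log. Assumes the line format shown in the text above;
# the function name is hypothetical.
connect_targets() {
    sed -n 's/.*connect(.*sin_port=htons(\([0-9]*\)), sin_addr=inet_addr("\([^"]*\)").*/\2 \1/p' "$1"
}
```

Piping the result through "sort | uniq -c" gives a count per destination, which is handy for long-running processes.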

Further interesting calls may be send (send data) and recvfrom (read data sent by the peer). If you are dealing with a running process and want to investigate what it actually does, and the communication is in cleartext, then these calls might provide interesting insight.

Case study

Suppose you're doing "netstat -pant" on a machine and find the following unfamiliar entry:

tcp        0      0 X.X.X.X:33205   Y.Y.Y.Y:6667      ESTABLISHED 27141/foobar
    

Port 6667 is IRC, so this program seems to be logged in to an IRC server...

What is this process "foobar", and what does it do?

Using the /proc filesystem, we first find that /proc/27141/exe points to a file /home/someuser/.temp/foobar. So in the next step, we run strings on this file, which yields, among a lot of fluff, the following interesting tidbit:

Welcome to iroffer-dinoex - http://iroffer.dinoex.net/
Version 3.17
** iroffer-dinoex is distributed under the GNU General Public License.
**    please see the LICENSE for more information.
    

With the help of Google, we find that iroffer is a program that can be used for file sharing, using something called the DCC protocol. However, it doesn't have a port open for listening, so we check the Wikipedia entry for DCC, and find that it stands for Direct Client-to-Client. What's more interesting, the Wikipedia entry describes that some versions of DCC work in reverse: the file sharing server offers a file in IRC, and the receiver opens a listening socket. The server then connects to that port and sends the file.

So at this point we start an "strace -p 27141" to monitor the process, and in fact in due time we can observe the exchange of IRC communication, followed by our process performing a connect() call to some remote host, as well as an open() call on a file in /home/someuser/.share/, followed by the send()ing of quite a huge amount of data.

How could this happen?

The process is running under the UID of someuser, and a quick look into ~someuser/.bash_history shows a wget command to download the iroffer source code, followed by commands to compile it, edit the configuration, and start it. Interestingly, editing the configuration is done with the nano editor, while the history file shows that someuser only ever uses emacs for editing.

At this point we conclude that an intruder somehow has managed to log in as user someuser and has installed the iroffer file sharing program. This usually indicates that the password has been bruteforced or obtained by other means (maybe the user has ssh'ed into the machine from an infected Windows PC). Using the timestamps of the files installed by the intruder, we search the syslog messages for an ssh login, but don't find any. However, the machine has port 5900 open (Remote Desktop, i.e. VNC), so maybe the intruder has managed to log in via VNC?

To investigate this, we use "cat /proc/27141/environ | tr '\0' '\n'" to dump the environment of the rogue process, and find environment variables like:

GNOME_KEYRING_SOCKET=/tmp/keyring-GhvCUA/socket
DESKTOP_SESSION=default
GNOME_KEYRING_PID=4957
DISPLAY=:0.0
XAUTHORITY=/home/rainer/.Xauthority
    

The presence of these environment variables shows that the process is inherited from a desktop session rather than an ssh login. So in fact it seems most likely that the intruder has managed to break in via VNC.
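
The same check can be written as a small filter. This is a sketch under some assumptions: the function name is made up, and the list of variables is only a starting point. SSH_CONNECTION and SSH_CLIENT are set by sshd for ssh logins, while DISPLAY alone is not conclusive, because ssh with X11 forwarding sets it as well.

```shell
# session_hints ENVIRON_FILE: print the variables from a /proc/PID/environ
# style file that hint at how the process was started.
# The function name and the variable list are assumptions for this sketch.
session_hints() {
    tr '\0' '\n' < "$1" |
        grep -E '^(DISPLAY|XAUTHORITY|DESKTOP_SESSION|SSH_CONNECTION|SSH_CLIENT)='
}
```

In the case above, "session_hints /proc/27141/environ" would have listed the desktop variables and no SSH_CONNECTION, which fits a login through the desktop rather than through ssh.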