Detecting failures caused by process limits on CentOS 6

Sommaire

On Linux, resources limits are there to ensure that a single user will not saturate the server resources. These limits can cause silent failures. The best way to detect these failures is to use the auditd daemon, which can catch the result of individual system calls.

To do this, we need to know which system calls are used and which result needs to be catched. We are going to use the strace command to find out the system calls and the return codes to catch. The strace utility will follow a command and display all system calls and their results.

Too many open files

Opening a file

Let’s start by using gawk to create 1024 files, without closing any of them (which is really a bad idea):

BEGIN {
  # Creating 1024 files
  for ( I=1 ; I <= 1024 ; I++ ){
    FILE_I = "test/" I
    print "TEST" > FILE_I
    # No close( FILE_I )
  }
}

Let’s save this short program in a file named “nofile.awk”, then start it using strace:

mkdir -p test
strace gawk -f nofile.awk 2> nofile.log

The nofile.log file is quite big. Near the end, we find:

open("tmp/1021", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 1023
fstat(1023, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
fcntl(1023, F_SETFD, FD_CLOEXEC)        = 0
fcntl(1023, F_GETFL)                    = 0x8001 (flags O_WRONLY|O_LARGEFILE)
fcntl(1023, F_GETFL)                    = 0x8001 (flags O_WRONLY|O_LARGEFILE)
fstat(1023, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc140450000
lseek(1023, 0, SEEK_CUR)                = 0
ioctl(1023, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffea301dcd0) = -1 ENOTTY (Inappropriate ioctl for device)
open("tmp/1022", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 EMFILE (Too many open files)

We see that the system call open has received the EMFILE return code (“Too many open files”). Translating this as an audit rule, we have:

-a always,exit -F arch=b64 -S open -F exit=-EMFILE -k nofile

Opening a network socket

The maximum number of open files also applies to network sockets. Let’s try to reproduce this error with a gawk program opening 1024 UDP connexions:

BEGIN {
  # Opening 1024 UDP connexions
  for ( I = 1 ; I <= 1024 ; I++ ){
    ADDRESS = "/inet/udp/0/127.0.0.1/" (1024 + I) 
    print "TEST" |& ADDRESS
  }
}

After running this short program under strace, we find the same error on a dup call. The dup call is used to duplicate a file descriptor.

socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP) = 1023
bind(1023, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
connect(1023, {sa_family=AF_INET, sin_port=htons(1535), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
fstat(1023, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
fcntl(1023, F_SETFD, FD_CLOEXEC)        = 0
fcntl(1023, F_GETFL)                    = 0x2 (flags O_RDWR)
fstat(1023, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0fe056f000
lseek(1023, 0, SEEK_CUR)                = -1 ESPIPE (Illegal seek)
dup(1023)                               = -1 EMFILE (Too many open files)
close(1023)                             = 0

We can now prepare a new revision of our audit rule, which will check calls to open and calls to dup:

-a always,exit -F arch=b64 -S open -S dup -F exit=-EMFILE -k nofile

This error could also happen opening a socketi (with the socket system call). Let’s open a file at the start of our script. This should move the operations on file descriptor of 1 step (each call of socket is followed by a call of dup).

BEGIN {
  # Opening a file (never closed)
  print "TEST" > "test1"
  # Opening 1024 UDP connexions
  for ( I = 1 ; I <= 1024 ; I++ ){
    ADDRESS = "/inet/udp/0/127.0.0.1/" (1024 + I)
    print "TEST" |& ADDRESS
  }
}

After running this script under strace, we see that, this time, the error occurs on a socket call. We see that gawk tries different methods of opening a notwork socket before giving up:

socket(PF_NETLINK, SOCK_RAW, 0)         = -1 EMFILE (Too many open files)
socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = -1 EMFILE (Too many open files)
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = -1 EMFILE (Too many open files)
socket(PF_NETLINK, SOCK_RAW, 0)         = -1 EMFILE (Too many open files)
socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP) = -1 EMFILE (Too many open files)
socket(PF_NETLINK, SOCK_RAW, 0)         = -1 EMFILE (Too many open files)
socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP) = -1 EMFILE (Too many open files)
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, 0x7fffeb2c3570)       = 0

Which translates as the following detection rule:

-a always,exit -F arch=b64 -S socket -F exit=-EMFILE -k nofile

Audit rules for “Too many open files”

After some more reading of man pages, we get the following rules, covering all known system calls:

-a always,exit -F arch=b64 -S open -S openat -S dup -F exit=-EMFILE -k nofile
-a always,exit -F arch=b32 -S open -S openat -S dup -F exit=-EMFILE -k nofile
-a always,exit -F arch=b64 -S socket                -F exit=-EMFILE -k nofile
-a always,exit -F arch=b32 -S socketcall -F a0=1    -F exit=-EMFILE -k nofile

We have added the openat system call (opening a file in a specified directory) and the 32-bits system calls.

Too many processes

Second classical failure caused by limits: too many processes. Let’s reproduce this error:

BEGIN {
  # Launching 1024 yes processes
  for ( I = 1 ; I <= 1024 ; I++ ){
    # We add a different comment to each call
    # to trick Gawk into launching separate
    # processes
    COMMAND = "yes #" I
    COMMAND | getline
  }
}

We save the program as nproc.awk, and lanch it under strace:

ulimit -u 512
strace gawk -f nproc.awk 2> nproc.log

Note: we reduce the maximum number of processes to 512, to avoid failing first for opening too many files…

After running the script, we check the log file, and find a failed clone call, with the return code EAGAIN (“Try again later”):

clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f13af5b49d0) = 3549
close(978)                              = 0
fcntl(977, F_SETFD, FD_CLOEXEC)         = 0
ioctl(977, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffd6bb37d50) = -1 EINVAL (Invalid argument)
fstat(977, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
read(977, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 4096) = 4096
pipe([978, 979])                        = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f13af5b49d0) = -1 EAGAIN (Resource temporarily unavailable)

As before, with some additionnal man pages browsing, we can deduce the audit rules:

-a always,exit -F arch=b64 -S clone -S fork -S vfork -F exit=-EAGAIN -k nproc
-a always,exit -F arch=b32 -S clone -S fork -S vfork -F exit=-EAGAIN -k nproc

The final audit rules

To conclude, here are the audit rules to use to detect both kind of failures (too many open files, too many processes):

-a always,exit -F arch=b64 -S open -S openat -S dup  -F exit=-EMFILE -k nofile
-a always,exit -F arch=b32 -S open -S openat -S dup  -F exit=-EMFILE -k nofile
-a always,exit -F arch=b64 -S socket                 -F exit=-EMFILE -k nofile
-a always,exit -F arch=b32 -S socketcall -F a0=1     -F exit=-EMFILE -k nofile
-a always,exit -F arch=b64 -S clone -S fork -S vfork -F exit=-EAGAIN -k nproc
-a always,exit -F arch=b32 -S clone -S fork -S vfork -F exit=-EAGAIN -k nproc