Ruby Application Restart Behavior

Last updated December 02, 2024

When Heroku is going to shut down a dyno (for a restart or a new deployment, etc.), it first sends a SIGTERM signal to the processes in the dyno. The full process is documented in Dynos and the Dyno Manager.

This Unix signal provides an opportunity for a process to “shut down gracefully”. The Ruby VM receives this signal and raises a SignalException in the main thread of the process, which interrupts what your process is currently doing so it can clean itself up.
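To make the SignalException concrete, here is a minimal sketch that sends SIGTERM to the current process and rescues the resulting exception so it can be inspected. The method name observe_sigterm is hypothetical and used only for illustration, and the sketch assumes it runs in the main thread with no custom TERM handler installed. Rescuing the exception like this is for demonstration only; a real program should let it propagate so the process exits.

```ruby
# Sketch: a SIGTERM sent to this process surfaces as a SignalException
# in the main thread. Rescuing it here is only for inspection.
def observe_sigterm
  Process.kill("TERM", Process.pid)
  sleep 1 # give the VM a moment to deliver the signal if it has not already
  nil     # only reached if no exception was raised
rescue SignalException => e
  e
end

exception = observe_sigterm
puts "#{exception.class}: #{exception.message}"
puts "signal number: #{exception.signo}"
```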

Ensuring processes clean up

How exactly does your process clean itself up? When the SignalException is raised, Ruby runs any ensure blocks that are in scope as the exception unwinds the program. An example:

thread = Thread.new do
  begin
    while true
      sleep 1
    end
  ensure
    puts "ensure called"
  end
end

current_pid = Process.pid
signal      = "SIGTERM"
Process.kill(signal, current_pid)

When you run this you’ll see:

ensure called
Terminated: 15

Developers (should) naturally wrap sensitive operations such as deleting temporary files or closing connections in an ensure block. Putting these operations in an ensure block makes it more likely that the program will do the right thing when it exits.

Once all ensure blocks (that are in scope) have been called, the program exits.
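As a sketch of the kind of cleanup described above, the following example starts a child Ruby process that creates a temporary file and removes it in an ensure block, then sends the child a SIGTERM from the parent. The variable names (child_script, scratch, waiter) and the use of a child process are assumptions made for illustration; they let the parent observe the result without terminating itself.

```ruby
require "open3"
require "rbconfig"

# Child program: creates a scratch file, then removes it in `ensure`.
child_script = <<~'RUBY'
  require "tempfile"
  $stdout.sync = true               # the parent reads our output through a pipe

  scratch = Tempfile.create("job")  # stand-in for any resource needing cleanup
  begin
    puts scratch.path               # tell the parent where the file lives
    sleep                           # pretend to work until a signal arrives
  ensure
    scratch.close
    File.unlink(scratch.path)
    puts "scratch file removed"
  end
RUBY

stdin, stdout, waiter = Open3.popen2(RbConfig.ruby, "-e", child_script)
scratch_path = stdout.gets.chomp    # the child is now inside its begin block
Process.kill("TERM", waiter.pid)    # the same signal a dyno receives first
child_output = stdout.read          # everything printed after the signal
child_status = waiter.value         # Process::Status for the finished child
stdin.close
stdout.close

puts child_output
puts "scratch file still exists? #{File.exist?(scratch_path)}"
```

The parent waits for the child to print the file path before sending the signal, so the child is guaranteed to be inside the begin block when SIGTERM arrives.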

After Heroku sends SIGTERM to your application, it waits up to 30 seconds before sending SIGKILL to force it to shut down, even if it has not finished cleaning up. SIGKILL cannot be caught, so no Ruby code runs in response to it. In the following example, which sends SIGKILL instead of SIGTERM, the ensure block is never called and the program simply exits:

thread = Thread.new do
  begin
    while true
      sleep 1
    end
  ensure
    puts "ensure called"
  end
end

current_pid = Process.pid
signal      = "SIGKILL"
Process.kill(signal, current_pid)

The output is simply:

Killed: 9

This is the equivalent of running the famous $ kill -9 command. It is comparable to ending a task via CTRL+ALT+DELETE on Windows or a “force quit” on macOS.

Why some programs won’t die

Some programs will never terminate on their own after a SIGTERM. Here’s an example:

thread = Thread.new do
  begin
    while true
      sleep 1
    end
  ensure
    while true
      puts "ensure called"
      sleep 1
    end
  end
end

current_pid = Process.pid
signal      = "SIGTERM"
Process.kill(signal, current_pid)

The output will look like this:

ensure called
ensure called
ensure called
ensure called
ensure called
ensure called
ensure called
ensure called
ensure called

Forever and ever. Now imagine that, instead of a while loop, the ensure block is trying to close a connection and the remote server won’t respond. In this case the program would never end, and it would be stuck, unable to either respond to requests or die.

This is one way that a program can prevent itself from dying: by doing complicated, long, or impossible-to-complete tasks in an ensure block.
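One possible mitigation, sketched here under stated assumptions, is to give each cleanup step a time budget so that a stuck step is abandoned instead of blocking exit. The helper names bounded_cleanup and run_job and the 0.2 second budget are hypothetical; the sleep 5 call stands in for a remote server that never responds. Ruby's Timeout can interrupt code at arbitrary points, so treat this as an illustration of the idea, not a drop-in fix.

```ruby
require "timeout"

# Runs a cleanup step, abandoning it if it exceeds the budget (in seconds).
def bounded_cleanup(budget)
  Timeout.timeout(budget) { yield }
  :finished
rescue Timeout::Error
  :abandoned
end

# A job whose ensure block contains one stuck step and one quick step.
def run_job(log)
  log << "working"
ensure
  log << bounded_cleanup(0.2) { sleep 5 }      # stand-in for an unresponsive server
  log << bounded_cleanup(0.2) { :quick_step }  # a step that completes in time
end

log = []
run_job(log)
p log
```

Keeping the sum of all budgets well under 30 seconds leaves room for the process to exit before SIGKILL is sent.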

Even if you’re not using ensure blocks, you’re certainly using gems that do. One of them may be preventing your program from exiting. The way to fix this problem is by sending a SIGKILL to the process, which terminates it immediately.

The other way that your program may prevent itself from halting is if it catches the signal.

Signal.trap('TERM') do
  puts "Never going to die"
end

current_pid = Process.pid
signal      = "SIGTERM"
Process.kill(signal, current_pid)

This Signal.trap block catches the signal and prevents it from doing its original work. I don’t know of a good reason for using this functionality, and your program probably shouldn’t use it. If you must trap a signal, it is possible to call the previous behavior; see “Run code when Signal is sent, but do not trap the signal in Ruby”. Again, this is still not recommended: when your program receives a signal, it should exit quickly. It’s also not guaranteed that the handler will get called. A safer approach is using an ensure block where appropriate.

at_exit

You can also use Ruby’s at_exit to call code when your application is exiting:

at_exit { puts "done" }

current_pid = Process.pid
signal      = "SIGTERM"
Process.kill(signal, current_pid)
# => done
# => Terminated: 15
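When more than one at_exit block is registered, Ruby runs them in reverse order of registration. The sketch below demonstrates this in a child process that sends itself a SIGTERM; the child script and its messages are hypothetical, chosen only to make the ordering visible.

```ruby
require "open3"
require "rbconfig"

# Child program: registers two at_exit blocks, then terminates itself.
child_script = <<~'RUBY'
  $stdout.sync = true
  at_exit { puts "registered first" }
  at_exit { puts "registered second" }
  Process.kill("TERM", Process.pid)
  sleep 1 # give the VM a moment to deliver the signal
RUBY

child_output, child_status = Open3.capture2e(RbConfig.ruby, "-e", child_script)
exit_lines = child_output.lines.map(&:chomp)

puts exit_lines
```

Because of this ordering, a handler that depends on a resource should be registered after the handler that releases that resource.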
