
Ruby exit, exit!, SystemExit and at_exit blunder

Recently, we hit a problem with Ruby’s “exit” command. If something went horribly wrong and it made no sense for our application to continue in its current state, we would abort with “exit 1”. We use supervisord to manage processes, so when we exited with a status of 1, supervisord would assume something went wrong and restart the process for us. Or at least that is what we thought…

“exit 1” does not terminate the process directly; it raises a SystemExit exception. The process only exits with that status if nothing rescues the exception or replaces it along the way. This is very clearly explained in the documentation.

begin
  exit
  puts "never get here"
rescue SystemExit
  puts "rescued a SystemExit exception"
end
puts "after begin block"
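As a small sketch of what the rescued exception carries: SystemExit exposes the requested status through `status` and `success?`. Note that SystemExit is not a StandardError, so a bare `rescue` does not catch it; it has to be named (or caught via `rescue Exception`). The variable names below are illustrative only.

```ruby
# Rescue the SystemExit raised by `exit 1` and inspect what it carries.
rescued_status = nil
rescued_success = nil

begin
  exit 1
rescue SystemExit => e
  rescued_status = e.status     # the status that was passed to exit
  rescued_success = e.success?  # true only when the status is zero
end

# Because the exception was rescued, the process keeps running.
puts "rescued SystemExit with status #{rescued_status}"
puts "still running after the rescue"
```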

In our case, nothing was rescuing the exception. So what was swallowing our exit status?

The other way to handle exits is to register an “at_exit” block. Ruby runs any registered “at_exit” blocks when the process exits via “exit”, but skips them on an “exit!” call.
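The difference can be observed by running two tiny scripts in child interpreters, so that the parent survives to report what happened. This is only a sketch: the helper name `run_child` is made up for the example, and the handler writes to `$stderr` so that output buffering does not affect the result.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: run a snippet in a child Ruby and
# return its combined output and numeric exit status.
def run_child(script)
  output, status = Open3.capture2e(RbConfig.ruby, "-e", script)
  [output, status.exitstatus]
end

handler = 'at_exit { $stderr.puts "at_exit ran" }'

# Same handler, two different ways of leaving the process.
output_with_exit, _ = run_child("#{handler}; exit 1")
output_with_exit_bang, _ = run_child("#{handler}; exit! 1")

puts "exit:  #{output_with_exit.inspect}"
puts "exit!: #{output_with_exit_bang.inspect}"
```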

Here is what our rather naive at_exit block looked like…

at_exit do

  # do cleanup

  # now exit for real
  exit

end

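This harmless-looking piece of code was turning our “exit 1” into an “exit 0”: calling “exit” with no argument inside the block raises a new SystemExit with a success status, which replaces the original one. When supervisord sees one of its processes exit with a status of zero, it assumes all is good and does not try to restart the process. This is a big problem for uptime, and restarting failed processes is a major reason for using a tool like supervisord.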
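A minimal way to reproduce the effect, assuming a reasonably current MRI: run the same `exit 1` with and without the naive handler in a child interpreter and compare the statuses the parent sees. The helper `child_exit_status` is a name invented for this sketch.

```ruby
require "rbconfig"

# Hypothetical helper: run a snippet in a child Ruby and
# return the exit status the parent observes.
def child_exit_status(script)
  system(RbConfig.ruby, "-e", script)
  $?.exitstatus
end

# No handler: the status passed to exit reaches the parent.
status_plain = child_exit_status("exit 1")

# Naive handler: the bare exit inside at_exit replaces the status.
status_naive = child_exit_status("at_exit { exit }; exit 1")

puts "plain exit 1:         #{status_plain}"
puts "with at_exit { exit }: #{status_naive}"
```

From supervisord's point of view the second process looks like a clean shutdown.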

Instead, we should use “exit!” if we want to die hard [with a vengeance] and be sure that the process exits with a status of 1 and is restarted by supervisord. “exit!” bypasses all SystemExit rescue blocks and at_exit blocks.
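To illustrate the bypass, the sketch below runs a child script that wraps `exit! 1` in a SystemExit rescue and registers an at_exit handler. Neither should fire, and the parent should see status 1. The status is passed explicitly here rather than relying on the default.

```ruby
require "open3"
require "rbconfig"

# Child script: both the rescue and the at_exit handler try to report.
script = <<~'RUBY'
  at_exit { $stderr.puts "at_exit ran" }
  begin
    exit! 1
  rescue SystemExit
    $stderr.puts "rescued SystemExit"
  end
  $stderr.puts "still running"
RUBY

child_output, child_status = Open3.capture2e(RbConfig.ruby, "-e", script)

puts "output: #{child_output.inspect}"
puts "status: #{child_status.exitstatus}"
```

The trade-off is that any cleanup living in at_exit blocks is skipped as well.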

Alternatively, and better, we should never call “exit” inside an “at_exit” block, and we should make sure that all SystemExit rescue blocks and at_exit blocks are used with great caution and echo the original exit status when necessary.
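A sketch of a better-behaved handler: it does its cleanup, reads the pending status from `$!` if it needs it (the technique Avdi Grimm describes in the comments), and simply returns without calling `exit`. The cleanup itself is a placeholder here, and the script runs in a child interpreter so the resulting status can be checked.

```ruby
require "open3"
require "rbconfig"

script = <<~'RUBY'
  at_exit do
    # $! holds the SystemExit raised by exit, if that is why we are exiting.
    original = $!.is_a?(SystemExit) ? $!.status : nil
    $stderr.puts "cleanup done, original status: #{original.inspect}"
    # No exit call here: the process finishes with its original status.
  end

  exit 1
RUBY

cleanup_output, cleanup_status = Open3.capture2e(RbConfig.ruby, "-e", script)

puts cleanup_output
puts "status seen by parent: #{cleanup_status.exitstatus}"
```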

Update

Jonathan Rochkind made a really great clarification of the exit / at_exit situation in a comment below. I think it is important to read it.

Comments

  1. Avdi Grimm

    A good reminder; not enough people understand Ruby’s process termination system. Incidentally, I prefer `abort("Some error message")` to `exit(1)`. It sets the exit status to 1, and also sets the error message in the SystemExit exception.

    1. Deryl R. Doucette

      Yes, did not know about the abort() either! I’ve been hardcoding ‘exit 255’ in my scripts. Thanks!

  2. Jonathan Rochkind

    > at_exit blocks are used with great caution and echo the original exit status when necessary.

    Any clue on how one would do this, is there any way an “at_exit” block can access the ‘original exit status’?

    Or wait, is the key thing just that you do not need to and should not call `exit` inside `at_exit`? That is, `at_exit` should do its thing without calling `exit`, and then when it’s done being executed the process will still exit on its own, with the original exit value. Is that right?

    1. Avdi Grimm

      You can always find out what the exit status is by looking at the SystemExit exception, which will be found in `$!` (aka `$ERROR_INFO` if you require ‘English’). If I may shamelessly self-promote for a moment, I go into this in depth in Exceptional Ruby.

  3. Rick Hull

    > Yes, I think putting a “exit” inside an “at_exit” block makes no sense

    Pardon the phrase, but this is the real WTF. Maybe at_exit good practices would be a good blog post. I’ve never had to use it so I’ve never looked into it myself. Also, I wholeheartedly recommend Avdi’s exceptional book.

  4. David Barri

    What a coincidence! I just blogged about something very similar!
    https://japgolly.blogspot.com.au/2012/09/problems-with-atexit-and-exit-and-rspec.html

    Nice.

  5. Jonathan Rochkind

    After running into my own weird at_exit related bug, I discovered this MRI bug report:

    https://bugs.ruby-lang.org/issues/5218

    I think a LOT of people’s at_exit related problems are actually due to this bug. Note that while the bug is marked fixed in the tracker, the reproduction still reproduces for me in 1.9.3p194, so I don’t think it’s made it into a release yet.

    And that bug report reproduction case also reveals something interesting — you ARE allowed to call `exit` in an `at_exit` block. It does not keep subsequent `at_exit` blocks from running. If everything is working properly, the exit code of the last `at_exit` block to call `exit` ‘wins’. But because of that bug, everything may not be working properly. There is a workaround with monkey-patch redefining `at_exit` in that bug report.

    That bug doesn’t actually account for YOUR problem as above. Your problem was a legitimate software error: don’t call `exit` in an `at_exit` block unless you actually WANT to set the exit code in the `at_exit` block. If you don’t, and don’t need to call `exit` in it, you’re fine. If you DO need to call `exit` in an `at_exit` block to set the exit code… then the MRI bug may interfere with your desires.
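The behavior described in the last comment can be checked with a sketch like the one below, assuming an MRI where the bug mentioned above is not in play. Two handlers each call `exit` with a different status. Handlers run in reverse order of registration, both are expected to run, and the status set by the last one to run is expected to win. On an affected interpreter the result may differ.

```ruby
require "open3"
require "rbconfig"

script = <<~'RUBY'
  at_exit { $stderr.puts "registered first";  exit 3 }
  at_exit { $stderr.puts "registered second"; exit 2 }
  exit 1
RUBY

chain_output, chain_status = Open3.capture2e(RbConfig.ruby, "-e", script)

puts chain_output
puts "final status: #{chain_status.exitstatus}"
```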