Exception handling in LDC using LLVM

Exception handling is an integral part of the D programming language. Naturally, LDC, aiming to be a conforming compiler, needs to provide it. Here I describe how user code, the generated LLVM IR, the unwinding library and the LDC runtime interact to make it all work, at least on x86 Linux.

There is some documentation on exception handling with LLVM and the pages linked from there contain further information, in particular the details on the unwinding runtime. Unfortunately, examples of actual use are hard to find, so trial and error has played a major role in learning the workings of LLVM EH. I’ll try to present a complete example here, but will assume you’ve at least skimmed through both documents.

First, the throw statement. Its basic job is simple: invoke the exception handling runtime by calling _Unwind_RaiseException with the address of an _Unwind_Exception struct. This struct contains, among some private data, an eight-byte exception class identifying the vendor and language it originates from (for LDC we set it to “LLDC” followed by “D1\0\0”) and a cleanup callback. Since the exception object being thrown must be communicated to the handler code, this struct is embedded in a larger one. Later, the address of the surrounding struct can be computed from the address of its unwind_info member.

Consequently, the outer struct looks like this

struct _d_exception {
  Object exception_object;
  _Unwind_Exception unwind_info;
}

and the code to invoke the unwinding runtime is straightforward:

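void _d_throw_exception(Object e) {
    if (e !is null) {
        _d_exception* exc_struct = new _d_exception;
        // vendor tag first, then language tag: eight bytes in total
        exc_struct.unwind_info.exception_class[0..4] = "LLDC";
        exc_struct.unwind_info.exception_class[4..8] = "D1\0\0";
        exc_struct.exception_object = e;
        // hands control to the unwinder; returns only if unwinding fails
        _Unwind_RaiseException(&exc_struct.unwind_info);
    }
    // reached for a null exception or after a failed unwind
    abort();
}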
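
To make the remark about recovering the surrounding struct concrete, here is a minimal sketch of the pointer arithmetic involved. The _Unwind_Exception layout below is a simplified stand-in rather than the real declaration, and the helper name outerFromUnwindInfo is made up for this illustration; it is not taken from the LDC runtime.

```d
// Simplified stand-in for the unwinder's struct; the real one has a
// platform-specific layout. Only the embedding matters here.
struct _Unwind_Exception {
    char[8] exception_class;
    void* exception_cleanup;
    size_t private_1;
    size_t private_2;
}

struct _d_exception {
    Object exception_object;
    _Unwind_Exception unwind_info;
}

// Hypothetical helper: step back from the embedded member to the start
// of the enclosing struct by subtracting the member's offset.
_d_exception* outerFromUnwindInfo(_Unwind_Exception* info) {
    return cast(_d_exception*)(cast(ubyte*)info - _d_exception.unwind_info.offsetof);
}

void main() {
    auto exc = new _d_exception;
    exc.exception_object = new Object;

    // The unwinder only ever hands back the inner pointer.
    _Unwind_Exception* info = &exc.unwind_info;

    assert(outerFromUnwindInfo(info) is exc);
    assert(outerFromUnwindInfo(info).exception_object is exc.exception_object);
}
```

Because the offset is taken from the struct definition itself, the helper keeps working if the order or number of members changes.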

What happens on a throw is essentially the following:

  • _Unwind_RaiseException walks the stack using the unwind tables and, for each frame that has a landing pad set up, calls a ‘personality function’, asking it whether that frame can handle the exception object.
  • Once such a frame is found, it walks the stack again, this time telling the personality functions to execute the code in any intervening finally blocks.
  • In the end, it calls the personality function for the final landing pad with arguments indicating that control is to be transferred to the catch handler.
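
The two passes can be modelled with a toy simulation. This is only a sketch of the control flow described above: the Frame struct, its fields and the unwind function are invented for the illustration, and the real unwinder works on unwind tables and machine state rather than an array of frames.

```d
// Toy model of one stack frame; both flags stand for answers the
// personality function would give for that frame.
struct Frame {
    string name;
    bool catches;    // a catch block in this frame matches the exception
    bool hasFinally; // this frame has a finally block to run
}

// frames[0] is the innermost frame, where the throw happened.
// Returns a log of what would be executed.
string[] unwind(Frame[] frames) {
    string[] log;

    // Pass 1 (search): find the first frame willing to handle the exception.
    size_t handler = frames.length;
    foreach (i, f; frames) {
        if (f.catches) { handler = i; break; }
    }
    if (handler == frames.length) {
        // Nothing is unwound; the raise call returns to its caller.
        log ~= "no handler found";
        return log;
    }

    // Pass 2 (cleanup): run finally blocks of the frames in between,
    // then transfer control to the catch handler.
    foreach (f; frames[0 .. handler]) {
        if (f.hasFinally) log ~= "finally in " ~ f.name;
    }
    log ~= "catch in " ~ frames[handler].name;
    return log;
}

void main() {
    auto frames = [Frame("inner", false, true),
                   Frame("middle", false, false),
                   Frame("outer", true, false)];
    assert(unwind(frames) == ["finally in inner", "catch in outer"]);

    auto uncaught = [Frame("inner", false, true)];
    assert(unwind(uncaught) == ["no handler found"]);
}
```

The second case shows why _d_throw_exception ends in abort(): if the search pass finds no handler, no finally block runs and the raise call simply returns.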

Luckily, exception handling in D can be implemented using only a single personality function for all landing pads. This personality function decides what to do for each individual landing pad by parsing the language specific area of the unwind data. This area contains three tables: the callsite table, the action table and the classinfo table.

  • The callsite table maps instruction address ranges to indices into the action table. These address ranges mark the beginning and end of the code in a try block.
  • The action table contains chains of indices into the classinfo table and values that will be used to identify the action to the handler code later. An action corresponds to a catch or finally block.
  • The classinfo table holds the addresses to the classinfos of each class used in a catch parameter.

When the personality function is called and given the context of a certain landing pad, it looks up the instruction pointer, finds the right entry in the callsite table and then walks the corresponding action chain. For each possible action, it checks whether the thrown exception object is derived from the class specified by the respective classinfo. Once a match is found, it knows that this landing pad is responsible for the exception. When it is called again with instructions to transfer control to the handler, the personality function passes the exception object and the index from the action table to the handler code.
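
As a rough illustration of that lookup, the following sketch models the three tables as plain arrays. The struct layouts, the -1 sentinels and the names findHandler and isDerivedFrom are assumptions made for this example; the real tables are a compact binary encoding, and this covers only the search for a matching catch, not the cleanup pass.

```d
// In-memory stand-ins for the tables in the language-specific data area.
struct CallSite {
    size_t start;      // first instruction address of the try block
    size_t length;     // size of the covered address range
    size_t landingPad; // address of the landing pad
    int firstAction;   // index into the action table, -1 for none
}

struct Action {
    int classInfoIndex; // index into the classinfo table, -1 for a finally
    int selector;       // value passed on to the handler code
    int next;           // next action in the chain, -1 ends the chain
}

struct Match { bool found; size_t landingPad; int selector; }

// True if 'actual' is 'wanted' or one of its subclasses.
bool isDerivedFrom(const ClassInfo actual, const ClassInfo wanted) {
    if (actual is null) return false;
    return actual is wanted || isDerivedFrom(actual.base, wanted);
}

Match findHandler(size_t ip, Object thrown, CallSite[] callSites,
                  Action[] actions, const(ClassInfo)[] classInfos) {
    foreach (cs; callSites) {
        if (ip < cs.start || ip >= cs.start + cs.length)
            continue; // instruction pointer is outside this try block
        for (int a = cs.firstAction; a != -1; a = actions[a].next) {
            auto act = actions[a];
            if (act.classInfoIndex == -1)
                continue; // a finally block never claims the exception
            if (isDerivedFrom(typeid(thrown), classInfos[act.classInfoIndex]))
                return Match(true, cs.landingPad, act.selector);
        }
        break; // only one callsite entry covers a given address
    }
    return Match(false, 0, 0);
}

class Base {}
class Derived : Base {}
class Other {}

void main() {
    const(ClassInfo)[] classInfos = [typeid(Base)];
    auto actions = [Action(0, 1, 1), Action(-1, 0, -1)]; // catch(Base), then finally
    auto callSites = [CallSite(0x100, 0x20, 0x200, 0)];

    auto hit = findHandler(0x110, new Derived, callSites, actions, classInfos);
    assert(hit.found && hit.landingPad == 0x200 && hit.selector == 1);

    assert(!findHandler(0x110, new Other, callSites, actions, classInfos).found);
    assert(!findHandler(0x300, new Derived, callSites, actions, classInfos).found);
}
```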

If you’re interested in the code that accomplishes this, take a look here.

The last step required to make EH work is to provide the handler code and to write out the correct unwind tables. Let’s look at some user code and what it is essentially turned into by LDC (of course the actual output is LLVM IR). The situation grows considerably more complex when there are nested try-catch-finallys in the same stack frame, but I hope this snippet illustrates the basic ideas.


try
{
  code_try();
}
catch(ExceptionClass ec)
{
  code_catch(ec);
}
finally
{
  code_finally();
}

is essentially turned into:

// this is an invoke with
// 'handler' as exception target
code_try();
goto end;
 
handler:
 
ehptr = llvm.eh.exception();
ehsel = llvm.eh.selector(
    ehptr,
    &_d_eh_personality,
    ExceptionClass.classinfo,
    0);
 
switch(ehsel)
{
  case 1:
    code_catch(ehptr);
    goto end;
  default: // ehsel == 0
    code_finally();
    _Unwind_Resume(&ehptr.unwind_info);
}
// unreachable
 
end:
code_finally();

The llvm.eh.* intrinsics get the exception object and the action table index that are passed in by the personality function as mentioned above. But there’s more going on here: the selector intrinsic also tells LLVM what the data in the unwind tables should be. In particular, the personality function and the exception classinfos are set here. The zero indicates the finally block. The call to code_try() has been turned into an invoke, which makes LLVM emit an entry in the callsite table for it.

As you can see, the unwinding runtime and LLVM code generator are tied closely via the two intrinsics and thus supporting other runtimes such as Windows structured exception handling will be nigh-impossible without changes to LLVM. Hopefully, getting llvm-gcc to support exception handling on Windows will be enough of an incentive for the LLVM team to provide that feature eventually.

Another thing to bear in mind is that LLVM’s exception support is, at the moment, very C++ specific. The code generator can fill the language specific data area only with the three C++ style tables mentioned above. Fortunately, D’s exceptions are similar enough that we can get the right behavior by inserting suitable values into these tables.

For now, the implementation in LDC has only been tested on x86 Linux, though the PowerPC target should work as well. EH on x86-64 Linux will supposedly be enabled in the next LLVM release. The remaining issues should be solved as LLVM matures, enabling LDC to provide correct exception handling support on more platforms.

Comments 7

  1. Denis Koroskin wrote:

    Good job!

    Could you post more notes on LLVMDC status, please? This makes tracking the progress a lot easier. It is interesting to know at what point you are now and what are the plans for future iterations. More posts also means more digg-it’s, redd-it’s and Google-it’s!

    Posted 03 Oct 2008 at 21:41
  2. Christian wrote:

    The simplest way to stay up to date on LLVMDC development is to visit our IRC channel. I generally only write about progress here if I feel it’s worth it.

    That said, I will put up a post when the slides and video of our talk at the Tango Conference are released. I’ll also think about doing monthly status update posts.

    Posted 04 Oct 2008 at 9:34
  3. Denis Koroskin wrote:

    Nice. Hope to see the first public LDC release along with some juicy tests comparing its performance/optimizations against DMD :) Too bad the LLVM 2.4 release is delayed for a week…

    Looking forward to a big interesting article about the progress, roadmap etc. once it is finally released!

    Posted 28 Oct 2008 at 12:38
  4. Clay Smith wrote:

    I think I’d be interested in giving LLVMDC a spin once it gets into the alpha stage. I read on a ticket that it can already compile Tango, which is pretty impressive. Keep up the good work.

    Posted 06 Nov 2008 at 0:02
  5. software developers wrote:

    Hey, that was interesting,

    The code has come in very useful,

    Thanks for sharing,

    Keep up the good work

    Posted 23 Oct 2009 at 11:20
  6. Garrison wrote:

    Although I don’t know the D language, I believe there is a typo in the definition of _Unwind_Action as the values should be or’able. From my readings HANDLER_PHASE should have the value 4 and FORCE_UNWIND should have the value 8. Of course you may have pre-massaged values behind the scenes that I’m not seeing.

    Posted 18 Dec 2009 at 17:41
  7. Christian wrote:

    Garrison: You’re right, I’ve fixed _Unwind_Action. In addition, HANDLER_PHASE should actually have been HANDLER_FRAME. Check http://www.dsource.org/projects/ldc/changeset/1597%3A761bf823e59e .

    Posted 18 Dec 2009 at 20:00