The goal of this document is to give an overview of the exception handling
options in Breakpad.

# Basics

Exception handling is a mechanism designed to handle the occurrence of
exceptions, special conditions that change the normal flow of program execution.

On Windows, `SetUnhandledExceptionFilter` installs a filter that receives all
unhandled exceptions when Breakpad is enabled. TODO: More on first and second
chance and vectored vs. try/catch.

There are two main types of exception handling across all platforms: in-process
and out-of-process.

# In-Process

In-process exception handling is relatively simple since the crashing process
handles crash reporting. However, it is generally considered unsafe to write a
minidump from a crashed process. For example, key data structures could be
corrupted or the stack on which the exception handler runs could have been
overwritten. For this reason all platforms also support some level of
out-of-process exception handling.

## Windows

For in-process exception handling, Breakpad creates a 'handler thread' at start
up that waits indefinitely on a semaphore. When this thread is woken, it writes
the minidump and signals to the excepting thread that it may continue. A filter
will tell the OS to kill the process if the minidump is written successfully.
Otherwise it continues.

# Out-of-Process

Out-of-process exception handling is more complicated than in-process exception
handling because of the need to set up a separate process that can read the
state of the crashing process.

## Windows

Breakpad uses two abstractions around the exception handler to make things work:
`CrashGenerationServer` and `CrashGenerationClient`. The constructor for each
takes the name of a named pipe.

During server start up, a named pipe is created and callbacks are registered for
client connections. The named pipe is used for registration, and all IO on the
pipe is done asynchronously.
`OnPipeConnected` is called when a client attempts to connect (that is, calls
`CreateFile` on the pipe). `OnPipeConnected` performs the state machine
transition from `Initial` to `Connecting` and on through `Reading`,
`Reading_Done`, `Writing`, `Writing_Done`, `Reading_ACK`, and `Disconnecting`.

When registering callbacks, the client passes in two pointers to pointers:

1. A pointer to the `EXCEPTION_INFO` pointer
1. A pointer to the `MDRawAssertionInfo` pointer, which is used for various
   non-exception failures like assertions

The essence of registration is adding a `ClientInfo` object to an array
maintained by the server. The object contains handles used for synchronization
with the crashing process. This is how the server keeps track of all the clients
on the system that have registered for minidumps. These handles are:

* `server_died` (mutex)
* `dump_requested` (event)
* `dump_generated` (event)

The server registers asynchronous waits on these events with the `ClientInfo`
object as the callback context. When the `dump_requested` event is set by the
client, the `OnDumpRequested()` callback is called. The server uses the handles
inside `ClientInfo` to communicate with the client process. Once the client sets
the event, it waits for two objects:

1. the `dump_generated` event
1. the `server_died` mutex

In the end, the handles are "duped" into the client process, and clients use
`SetEvent` to request a dump and then wait on the other event or on the
`server_died` mutex.

## Linux

### Current Status

As of July 2011, Linux had a minidump generator that was not entirely
out-of-process. The minidump was generated from a separate process, but one that
shared an address space, file descriptors, signal handlers and much else with
the crashing process. It worked by using the `clone()` system call to duplicate
the crashing process, and then used `ptrace()` and the `/proc` file system to
retrieve the information required to write the minidump.
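
The "separate process, shared address space" arrangement described above can be
illustrated with a minimal sketch. This is not Breakpad's actual code: the flag
set (`CLONE_VM | CLONE_FS | CLONE_FILES`) and the names `HelperArgs`,
`HelperEntry` and `RunSharedHelper` are assumptions chosen for illustration, and
the `ptrace()` and `/proc` inspection is left out entirely.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // Needed for clone() in <sched.h> on glibc.
#endif
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>

#include <cassert>
#include <csignal>
#include <cstdlib>

// Hypothetical argument block shared between the parent and the helper.
struct HelperArgs {
  const int* state_in_parent;  // Memory owned by the "crashing" process.
  int observed;                // Written by the helper, read by the parent.
};

// Entry point of the cloned helper. Because of CLONE_VM it reads and writes
// the parent's memory directly, which is why this arrangement is not truly
// out-of-process.
static int HelperEntry(void* raw) {
  HelperArgs* args = static_cast<HelperArgs*>(raw);
  args->observed = *args->state_in_parent;
  return 0;
}

// Runs HelperEntry in a clone()d child and returns the value it observed,
// or -1 on failure.
int RunSharedHelper(const int* state) {
  const size_t kStackSize = 64 * 1024;
  char* stack = static_cast<char*>(std::malloc(kStackSize));
  if (stack == nullptr) return -1;

  HelperArgs args = {state, -1};
  // The stack grows downward on common Linux targets, so pass its top.
  // SIGCHLD as the exit signal lets a plain waitpid() reap the child.
  const pid_t child = clone(HelperEntry, stack + kStackSize,
                            CLONE_VM | CLONE_FS | CLONE_FILES | SIGCHLD, &args);
  if (child == -1) {
    std::free(stack);
    return -1;
  }

  int status = 0;
  const pid_t waited = waitpid(child, &status, 0);
  std::free(stack);
  if (waited != child || !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
    return -1;
  }
  return args.observed;
}
```

Calling `RunSharedHelper(&value)` returns whatever the helper read from the
caller's memory. A fully out-of-process writer would have no such direct access
and would also be unaffected if that memory were corrupted.
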
Since then, Breakpad has updated Linux exception handling to provide more of
the benefits of out-of-process report generation.

### Proposed Design

#### Overview

Breakpad would use a per-user daemon to write out the minidump, so that the
minidump writer does not share state with, or depend on, the crashing process.
We don't want to start a new separate process every time a user launches a
Breakpad-enabled process. A single daemon per machine is unacceptable because of
security concerns around one user being able to initiate minidump generation for
another user's process.

#### Client/Server Communication

On Breakpad initialization in a process, the initializer would check if the
daemon is running and, if not, start it. The race condition between the check
and the initialization is not a problem because each daemon can check whether
the IPC endpoint already exists and whether a server is listening. Even if
multiple copies of the daemon try to `bind()` the socket to its filesystem name,
all but one will fail and can terminate.

This point is relevant for error handling conditions. Linux does not clean up
the file system representation of a UNIX domain socket even if both endpoints
terminate, so checking for existence is not strong enough. However, checking the
process list or sending a ping on the socket can handle this.

Breakpad uses UNIX domain sockets since they support full duplex communication
(unlike on Windows, named pipes on Linux are half duplex) and the kernel
automatically creates a private channel between the client and server once the
client calls `connect()`.

#### Minidump Generation

Breakpad could use the current system with `ptrace()` and `/proc` within the
daemon executable.

Overall the operations look like:

1. Signal from OS indicating crash
1. Signal handler suspends all threads except itself
1. Signal handler sends `CRASH_DUMP_REQUEST` message to server and waits for
   response
1. Server inspects the crashing process
1. Minidump is asynchronously written to disk by the server
1. Server responds indicating inspection is done

## Mac OSX

Out-of-process exception handling is fully supported on Mac.
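
Returning to the Linux proposed design: the request/response part of the
operation list (the handler sends `CRASH_DUMP_REQUEST`, the server inspects and
replies) can be sketched as follows. This is a hedged illustration rather than
Breakpad code. It swaps in `socketpair()` and `fork()` in place of a named
socket and a long-running daemon so that it is self-contained. The numeric
message values, the `CRASH_DUMP_DONE` reply name, and the function names are all
assumptions. Thread suspension and the actual inspection are omitted.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cstdint>

// Hypothetical wire values; the design only names CRASH_DUMP_REQUEST.
enum Message : uint8_t {
  CRASH_DUMP_REQUEST = 1,
  CRASH_DUMP_DONE = 2,  // Assumed name for the server's response.
};

// Client side (the crashing process): send the request, then block until the
// server reports that inspection is finished. Uses only read()/write(), which
// are async-signal-safe and therefore usable from a signal handler.
bool RequestDump(int fd) {
  const uint8_t request = CRASH_DUMP_REQUEST;
  if (write(fd, &request, 1) != 1) return false;
  uint8_t reply = 0;
  if (read(fd, &reply, 1) != 1) return false;
  return reply == CRASH_DUMP_DONE;
}

// Server side (the daemon): wait for one request, then reply.
bool ServeOneRequest(int fd) {
  uint8_t request = 0;
  if (read(fd, &request, 1) != 1 || request != CRASH_DUMP_REQUEST) return false;
  // A real daemon would inspect the client here (for example with ptrace()
  // and /proc) and write the minidump to disk asynchronously.
  const uint8_t reply = CRASH_DUMP_DONE;
  return write(fd, &reply, 1) == 1;
}

// Connects both sides over a pair of UNIX domain sockets, with the daemon
// played by a forked child. Returns true if both ends completed the handshake.
bool RunHandshake() {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return false;

  const pid_t daemon = fork();
  if (daemon == -1) {
    close(fds[0]);
    close(fds[1]);
    return false;
  }
  if (daemon == 0) {
    close(fds[0]);
    const bool ok = ServeOneRequest(fds[1]);
    _exit(ok ? 0 : 1);
  }

  close(fds[1]);
  const bool client_ok = RequestDump(fds[0]);
  close(fds[0]);

  int status = 0;
  if (waitpid(daemon, &status, 0) != daemon) return false;
  return client_ok && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Because the socket is full duplex, the client can send its request and wait for
the reply on the same descriptor, which keeps the signal-handler side down to
one `write()` and one `read()`.
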