The goal of this document is to give an overview of the exception handling
options in Breakpad.

# Basics

Exception handling is a mechanism designed to handle the occurrence of
exceptions, special conditions that change the normal flow of program execution.

When Breakpad is enabled, the filter installed with
`SetUnhandledExceptionFilter` handles all unhandled exceptions. TODO: More on
first and second chance and vectored v. try/catch.

There are two main types of exception handling across all platforms: in-process
and out-of-process.

# In-Process

In-process exception handling is relatively simple since the crashing process
handles crash reporting. However, it is generally considered unsafe to write a
minidump from a crashed process. For example, key data structures could be
corrupted or the stack on which the exception handler runs could have been
overwritten. For this reason all platforms also support some level of
out-of-process exception handling.

## Windows

For in-process exception handling, Breakpad creates a "handler thread" at
startup that waits indefinitely on a semaphore. When this thread is woken, it
writes the minidump and signals to the excepting thread that it may continue.
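
The wake-and-wait handshake between the excepting thread and the handler thread
can be sketched in portable C++. This is only an illustration of the pattern,
not Breakpad's code: the class `HandlerThread`, its methods, and the string that
stands in for a minidump are all hypothetical, and a mutex with a condition
variable is used here in place of the Win32 semaphore the text describes.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical, portable stand-in for the handler thread; not Breakpad code.
class HandlerThread {
 public:
  // Starts the worker, which immediately blocks waiting for a request.
  HandlerThread() : worker_([this] { Run(); }) {}

  ~HandlerThread() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      shutdown_ = true;
    }
    cv_.notify_all();
    worker_.join();
  }

  // Called on the excepting thread: wakes the handler thread, then blocks
  // until the handler reports that the dump attempt has finished.
  bool RequestDump(const std::string& reason) {
    std::unique_lock<std::mutex> lock(mu_);
    assert(!requested_ && "this sketch supports one request at a time");
    reason_ = reason;
    requested_ = true;
    finished_ = false;
    cv_.notify_all();
    cv_.wait(lock, [this] { return finished_; });
    return succeeded_;
  }

  std::string last_dump() {
    std::lock_guard<std::mutex> lock(mu_);
    return last_dump_;
  }

 private:
  void Run() {
    std::unique_lock<std::mutex> lock(mu_);
    for (;;) {
      // Sleep until a dump is requested or the object is being destroyed.
      cv_.wait(lock, [this] { return requested_ || shutdown_; });
      if (shutdown_) return;
      requested_ = false;
      last_dump_ = "minidump for: " + reason_;  // stand-in for the real write
      succeeded_ = true;
      finished_ = true;
      cv_.notify_all();  // tell the excepting thread it may continue
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  bool requested_ = false;
  bool finished_ = false;
  bool succeeded_ = false;
  bool shutdown_ = false;
  std::string reason_;
  std::string last_dump_;
  std::thread worker_;  // declared last so the state above exists before Run()
};
```

Running the dump on a separate, pre-created thread means the work does not
depend on the stack of the thread that raised the exception, which may be
damaged. The return value of `RequestDump` plays the role of the success flag
that the exception filter consults.
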
The exception filter then tells the OS to kill the process if the minidump was
written successfully. Otherwise, exception processing continues.

# Out-of-Process

Out-of-process exception handling is more complicated than in-process exception
handling because of the need to set up a separate process that can read the
state of the crashing process.

## Windows

Breakpad uses two abstractions around the exception handler to make things work:
`CrashGenerationServer` and `CrashGenerationClient`. The constructors for both
take the name of a named pipe.

During startup, the server creates a named pipe and registers callbacks for
client connections. The named pipe is used for registration, and all IO on the
pipe is done asynchronously. `OnPipeConnected` is called when a client attempts
to connect (that is, calls `CreateFile` on the pipe). `OnPipeConnected` drives
the state machine from `Initial` to `Connecting` and on through `Reading`,
`Reading_Done`, `Writing`, `Writing_Done`, `Reading_ACK`, and `Disconnecting`.

When registering callbacks, the client passes in two pointers to pointers:

1. A pointer to the `EXCEPTION_INFO` pointer
1. A pointer to the `MDRawAssertionInfo`, which handles various non-exception
   failures like assertions

The essence of registration is adding a `ClientInfo` object to an array
maintained by the server. The object contains the handles used for
synchronization with the crashing process. This is how the server keeps track
of all the clients on the system that have registered for minidumps. These
handles are:

* `server_died` (mutex)
* `dump_requested` (event)
* `dump_generated` (event)

The server registers asynchronous waits on these events with the `ClientInfo`
object as the callback context. When the `dump_requested` event is set by the
client, the `OnDumpRequested()` callback is called. The server uses the handles
inside `ClientInfo` to communicate with the child process. Once the child sets
the event, it waits for two objects:

1. the `dump_generated` event
1. the `server_died` mutex

In the end, the handles are duplicated ("duped") into the client process. The
client calls `SetEvent` to request a dump and then waits on either the other
event or the `server_died` mutex.

## Linux

### Current Status

As of July 2011, Linux had a minidump generator that was not entirely
out-of-process.
The minidump was generated from a separate process, but one that shared an
address space, file descriptors, signal handlers and much else with the
crashing process. It worked by using the `clone()` system call to duplicate the
crashing process, and then used `ptrace()` and the `/proc` file system to
retrieve the information required to write the minidump. Since then Breakpad has
updated Linux exception handling to provide more benefits of out-of-process
report generation.

### Proposed Design

#### Overview

Breakpad would use a per-user daemon to write out the minidump, so that
minidump generation does not have to interact with or depend on the crashing
process. We don't want to start a new separate process every time a user
launches a Breakpad-enabled process. A single daemon per machine is unacceptable
because of the security risk of one user being able to initiate minidump
generation for another user's process.

#### Client/Server Communication

On Breakpad initialization in a process, the initializer would check whether the
daemon is running and, if not, start it. The race condition between the check
and the initialization is not a problem because each daemon can check whether
the IPC endpoint already exists and whether a server is listening on it.
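
One way to check for a listening server is simply to attempt a `connect()` on
the socket path. The sketch below assumes a UNIX domain stream socket. The
helper names `MakeAddress`, `BindAndListen` and `IsServerListening` are
hypothetical and are not part of Breakpad. `BindAndListen` is included so the
probe has something to talk to; it fails when the path is already taken, so a
second daemon started by mistake can detect that and exit.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

// Fills in a UNIX domain socket address; returns false if the path is too long.
static bool MakeAddress(const std::string& path, sockaddr_un* addr) {
  assert(addr != nullptr);
  std::memset(addr, 0, sizeof(*addr));
  addr->sun_family = AF_UNIX;
  if (path.size() >= sizeof(addr->sun_path)) return false;
  std::memcpy(addr->sun_path, path.c_str(), path.size() + 1);
  return true;
}

// Daemon side (hypothetical helper): returns a listening descriptor, or -1 if
// the socket cannot be bound, for example because the path already exists.
int BindAndListen(const std::string& path) {
  sockaddr_un addr;
  if (!MakeAddress(path, &addr)) return -1;
  int fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) return -1;
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
      listen(fd, 8) != 0) {
    close(fd);
    return -1;
  }
  return fd;
}

// Client side (hypothetical helper): true only if a server accepts the
// connection. A missing path or a leftover socket file with no listener
// both make connect() fail, so both report false.
bool IsServerListening(const std::string& path) {
  sockaddr_un addr;
  if (!MakeAddress(path, &addr)) return false;
  int fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) return false;
  bool connected =
      connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0;
  close(fd);
  return connected;
}
```

An initializer could call `IsServerListening` first and start the daemon only
when it returns false. Because the probe relies on `connect()` rather than on
the existence of the file, it also reports false for a socket file left behind
by a daemon that has exited.
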
Even if multiple copies of the daemon try to `bind()` the socket to its
filesystem name, all but one will fail and can terminate.

This point is relevant for error handling conditions. Linux does not clean up
the file system representation of a UNIX domain socket even if both endpoints
terminate, so checking for existence is not strong enough. However, checking the
process list or sending a ping on the socket can handle this.

Breakpad uses UNIX domain sockets since they support full duplex communication
(unlike on Windows, named pipes on Linux are half duplex) and the kernel
automatically creates a private channel between the client and server once the
client calls `connect()`.

#### Minidump Generation

Breakpad could use the current system with `ptrace()` and `/proc` within the
daemon executable.

Overall the operations look like:

1. Signal from OS indicating crash
1. Signal handler suspends all threads except itself
1. Signal handler sends `CRASH_DUMP_REQUEST` message to server and waits for
   response
1. Server inspects the crashing process
1. Minidump is asynchronously written to disk by the server
1. Server responds indicating inspection is done

## Mac OSX

Out-of-process exception handling is fully supported on Mac.