The goal of this document is to give an overview of the exception handling
options in Breakpad.

# Basics

Exception handling is a mechanism designed to handle the occurrence of
exceptions, special conditions that change the normal flow of program execution.

On Windows, Breakpad installs its handler with `SetUnhandledExceptionFilter`,
which replaces the process's top-level unhandled exception filter while Breakpad
is enabled. TODO: More on first and second chance and vectored v. try/catch.

There are two main types of exception handling across all platforms: in-process
and out-of-process.

# In-Process

In-process exception handling is relatively simple since the crashing process
handles crash reporting itself. However, it is generally considered unsafe to
write a minidump from a crashed process. For example, key data structures could
be corrupted or the stack on which the exception handler runs could have been
overwritten. For this reason all platforms also support some level of
out-of-process exception handling.

## Windows

For in-process exception handling, Breakpad creates a 'handler thread' at start
up that waits indefinitely on a semaphore. When this thread is woken it writes
the minidump and signals to the excepting thread that it may continue. A filter
will tell the OS to kill the process if the minidump is written successfully.
Otherwise it continues.

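The hand-off between the excepting thread and the handler thread can be
sketched in portable C++. This is an illustration only, not Breakpad's code:
`Signal`, `InProcessHandler`, `FilterAction` and `FilterDecision` are
hypothetical names, a mutex plus condition variable stand in for the Win32
semaphore, and the "minidump" is just a string. It assumes only one thread
reports an exception at a time.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Portable stand-in for a Win32 semaphore with a maximum count of one.
class Signal {
 public:
  void Post() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      set_ = true;
    }
    cv_.notify_one();
  }
  void Wait() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return set_; });
    set_ = false;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool set_ = false;
};

// What the exception filter tells the OS (hypothetical names).
enum class FilterAction { kTerminateProcess, kContinueSearch };

FilterAction FilterDecision(bool dump_written) {
  return dump_written ? FilterAction::kTerminateProcess
                      : FilterAction::kContinueSearch;
}

class InProcessHandler {
 public:
  // The handler thread is created at start up and parks on start_.
  InProcessHandler() : thread_([this] { Run(); }) {}

  ~InProcessHandler() {
    assert(thread_.joinable());
    shutting_down_ = true;
    start_.Post();
    thread_.join();
  }

  // Runs on the excepting thread: wake the handler thread, then block until
  // it reports that the dump attempt has finished.
  bool HandleException(const std::string& reason) {
    reason_ = reason;
    dump_written_ = false;
    start_.Post();
    finish_.Wait();
    return dump_written_;
  }

  const std::string& last_dump() const { return last_dump_; }

 private:
  void Run() {
    for (;;) {
      start_.Wait();
      if (shutting_down_) return;
      last_dump_ = "minidump for: " + reason_;  // placeholder for real work
      dump_written_ = true;
      finish_.Post();  // the excepting thread may continue
    }
  }

  Signal start_;
  Signal finish_;
  std::atomic<bool> shutting_down_{false};
  std::string reason_;
  std::string last_dump_;
  bool dump_written_ = false;
  std::thread thread_;  // declared last so it starts after the other members
};
```

Writing the dump on a dedicated thread means the work runs on a stack that was
set up before the crash, rather than on the possibly damaged stack of the
excepting thread.
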
# Out-of-Process

Out-of-process exception handling is more complicated than in-process exception
handling because of the need to set up a separate process that can read the
state of the crashing process.

## Windows

Breakpad uses two abstractions around the exception handler to make things work:
`CrashGenerationServer` and `CrashGenerationClient`. The constructors for both
take the name of a named pipe.

During start up, the server creates a named pipe and registers callbacks for
client connections. The named pipe is used for registration, and all IO on the
pipe is done asynchronously. `OnPipeConnected` is called when a client attempts
to connect (by calling `CreateFile` on the pipe). `OnPipeConnected` performs the
state machine transition from `Initial` to `Connecting`, and the server then
moves on through `Reading`, `Reading_Done`, `Writing`, `Writing_Done`,
`Reading_ACK`, and `Disconnecting`.

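The state sequence above can be written down as a small transition function.
This is a sketch of the happy path only: the state names come from the text,
but `IpcState`, `NextState` and `TransitionsPerConnection` are hypothetical, the
return from `Disconnecting` to `Initial` is an assumption, and a real server
would also have to handle failed or cancelled IO at each step.

```cpp
#include <cassert>

// States named in the text, in the order a successful registration visits them.
enum class IpcState {
  Initial,
  Connecting,
  Reading,
  Reading_Done,
  Writing,
  Writing_Done,
  Reading_ACK,
  Disconnecting,
};

// Returns the state that follows `state` when the pending IO succeeds.
IpcState NextState(IpcState state) {
  switch (state) {
    case IpcState::Initial:       return IpcState::Connecting;
    case IpcState::Connecting:    return IpcState::Reading;
    case IpcState::Reading:       return IpcState::Reading_Done;
    case IpcState::Reading_Done:  return IpcState::Writing;
    case IpcState::Writing:       return IpcState::Writing_Done;
    case IpcState::Writing_Done:  return IpcState::Reading_ACK;
    case IpcState::Reading_ACK:   return IpcState::Disconnecting;
    // Assumption: after disconnecting, the pipe waits for the next client.
    case IpcState::Disconnecting: return IpcState::Initial;
  }
  return IpcState::Initial;  // not reached for valid enum values
}

// Counts the transitions needed to serve one client and return to Initial.
int TransitionsPerConnection() {
  IpcState state = IpcState::Initial;
  int steps = 0;
  do {
    state = NextState(state);
    ++steps;
    assert(steps <= 8);  // the cycle visits each of the eight states once
  } while (state != IpcState::Initial);
  return steps;
}
```
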
When registering callbacks, the client passes in two pointers to pointers:

1. A pointer to the `EXCEPTION_INFO` pointer
1. A pointer to the `MDRawAssertionInfo` which handles various non-exception
   failures like assertions

The essence of registration is adding a `ClientInfo` object to an array
maintained by the server. The object contains handles used for synchronization
with the crashing process. This is how the server keeps track of all the clients
on the system that have registered for minidumps. These handles are:

* `server_died` (mutex)
* `dump_requested` (event)
* `dump_generated` (event)

The server registers asynchronous waits on these events with the `ClientInfo`
object as the callback context. When the `dump_requested` event is set by the
client, the `OnDumpRequested()` callback is called. The server uses the handles
inside `ClientInfo` to communicate with the client process. Once the client sets
the event, it waits for two objects:

1. the `dump_generated` event
1. the `server_died` mutex

In the end, the handles are duplicated ("duped") into the client process. A
client calls `SetEvent` on `dump_requested` to request a dump, then waits on the
`dump_generated` event or the `server_died` mutex.

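The request and wait protocol can be modelled without Win32 to make the control
flow concrete. The sketch below is an assumption-laden simulation, not the
Breakpad implementation: `ClientInfoSketch`, `RequestDump`, `WaitForServer`,
`MarkServerDied` and `WaitResult` are hypothetical, and plain flags guarded by a
condition variable replace the real event and mutex handles. Only
`OnDumpRequested` borrows its name from the text.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Stand-in for the three synchronization handles stored per client.
struct ClientInfoSketch {
  std::mutex mu;
  std::condition_variable cv;
  bool dump_requested = false;  // set by the client
  bool dump_generated = false;  // set by the server
  bool server_died = false;     // models the abandoned server_died mutex
};

enum class WaitResult { DumpGenerated, ServerDied };

// Client side: the equivalent of SetEvent(dump_requested).
void RequestDump(ClientInfoSketch& info) {
  std::lock_guard<std::mutex> lock(info.mu);
  info.dump_requested = true;
  info.cv.notify_all();
}

// Server side: callback run when dump_requested is signalled.
// Returns false if no request was pending.
bool OnDumpRequested(ClientInfoSketch& info) {
  std::lock_guard<std::mutex> lock(info.mu);
  if (!info.dump_requested) return false;
  info.dump_requested = false;  // behaves like an auto-reset event
  // A real server would write the minidump here.
  info.dump_generated = true;
  info.cv.notify_all();
  return true;
}

// Models the server process exiting while clients are registered.
void MarkServerDied(ClientInfoSketch& info) {
  std::lock_guard<std::mutex> lock(info.mu);
  info.server_died = true;
  info.cv.notify_all();
}

// Client side: wait for whichever of the two objects is signalled first.
WaitResult WaitForServer(ClientInfoSketch& info) {
  std::unique_lock<std::mutex> lock(info.mu);
  info.cv.wait(lock, [&] { return info.dump_generated || info.server_died; });
  assert(info.dump_generated || info.server_died);
  return info.dump_generated ? WaitResult::DumpGenerated
                             : WaitResult::ServerDied;
}
```

Waiting on `server_died` as well as `dump_generated` is what keeps a crashing
client from blocking forever when the server has already exited.
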
## Linux

### Current Status

As of July 2011, Linux had a minidump generator that was not entirely
out-of-process. The minidump was generated from a separate process, but one that
shared an address space, file descriptors, signal handlers and much else with
the crashing process. It worked by using the `clone()` system call to duplicate
the crashing process, and then using `ptrace()` and the `/proc` file system to
retrieve the information required to write the minidump. Since then Breakpad has
updated Linux exception handling to provide more benefits of out-of-process
report generation.

### Proposed Design

#### Overview

Breakpad would use a per-user daemon to write out the minidump, so that the
writer does not share state with, interact with or depend on the crashing
process. We don't want to start a new separate process every time a user
launches a Breakpad-enabled process. A single daemon per machine is unacceptable
because of security concerns: one user would be able to initiate minidump
generation for another user's process.

96#### Client/Server Communication
97
98On Breakpad initialization in a process, the initializer would check if the
99daemon is running and, if not, start it. The race condition between the check
100and the initialization is not a problem because multiple daemons can check if
101the IPC endpoint already exists and if a server is listening. Even if multiple
102copies of the daemon try to `bind()` the filesystem to name the socket, all but
103one will fail and can terminate.
104
This point is relevant for error handling conditions. Linux does not clean up
the file system representation of a UNIX domain socket even if both endpoints
terminate, so checking for existence is not strong enough. However, checking the
process list or sending a ping on the socket can handle this.

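Both points above, that only one `bind()` can win and that a leftover socket
file does not prove a daemon is alive, can be checked with plain POSIX calls.
This is a minimal sketch, assuming a socket path chosen by the caller;
`MakeAddress`, `BindAndListen` and `DaemonIsListening` are illustrative names,
not Breakpad APIs, and a bare connection attempt plays the role of the "ping".

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

// Fills in a UNIX domain socket address; fails if the path does not fit.
static bool MakeAddress(const std::string& path, sockaddr_un* addr) {
  assert(addr != nullptr);
  if (path.size() >= sizeof(addr->sun_path)) return false;
  std::memset(addr, 0, sizeof(*addr));
  addr->sun_family = AF_UNIX;
  std::memcpy(addr->sun_path, path.c_str(), path.size() + 1);
  return true;
}

// Daemon side: returns a listening descriptor, or -1 if the name is taken
// (bind() fails with EADDRINUSE) or any other step fails.
int BindAndListen(const std::string& path) {
  sockaddr_un addr;
  if (!MakeAddress(path, &addr)) return -1;
  int fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) return -1;
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
      listen(fd, 8) != 0) {
    close(fd);
    return -1;
  }
  return fd;
}

// Client side: true only if something is accepting connections on `path`.
// A stale socket file makes connect() fail, typically with ECONNREFUSED.
bool DaemonIsListening(const std::string& path) {
  sockaddr_un addr;
  if (!MakeAddress(path, &addr)) return false;
  int fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) return false;
  bool connected =
      connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0;
  close(fd);
  return connected;
}
```
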
Breakpad uses UNIX domain sockets since they support full duplex communication
(unlike on Windows, named pipes on Linux are half duplex) and the kernel
automatically creates a private channel between the client and server once the
client calls `connect()`.

#### Minidump Generation

Breakpad could use the current system with `ptrace()` and `/proc` within the
daemon executable.

Overall the operations look like:

1. Signal from OS indicating crash
1. Signal handler suspends all threads except itself
1. Signal handler sends `CRASH_DUMP_REQUEST` message to server and waits for
   response
1. Server inspects the crashing process
1. Minidump is asynchronously written to disk by the server
1. Server responds indicating inspection is done

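The message exchange in the steps above could look roughly like the following
over a connected UNIX domain socket. This is a sketch under stated assumptions,
not a wire format defined by Breakpad: only the name `CRASH_DUMP_REQUEST` comes
from the text, while `INSPECTION_DONE`, `DumpMessage`, the numeric values and
the three functions are hypothetical. Thread suspension, `ptrace()` inspection,
retries on `EINTR` and short reads are all left out.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical message types; the values are arbitrary.
enum MessageType : uint32_t { CRASH_DUMP_REQUEST = 1, INSPECTION_DONE = 2 };

// Hypothetical fixed-size message so one write() carries a whole request.
struct DumpMessage {
  uint32_t type;
  int32_t pid;
};

// Signal handler side. Uses only async-signal-safe calls (getpid, write).
bool SendDumpRequest(int fd) {
  DumpMessage request = {CRASH_DUMP_REQUEST, static_cast<int32_t>(getpid())};
  return write(fd, &request, sizeof(request)) ==
         static_cast<ssize_t>(sizeof(request));
}

// Signal handler side. Blocks until the server says inspection is done.
bool WaitForInspectionDone(int fd) {
  DumpMessage reply;
  if (read(fd, &reply, sizeof(reply)) != static_cast<ssize_t>(sizeof(reply)))
    return false;
  return reply.type == INSPECTION_DONE;
}

// Server side. Handles one request and returns the requesting pid, or -1.
pid_t ServeOne(int fd) {
  assert(fd >= 0);
  DumpMessage request;
  if (read(fd, &request, sizeof(request)) !=
      static_cast<ssize_t>(sizeof(request)))
    return -1;
  if (request.type != CRASH_DUMP_REQUEST) return -1;
  // A real server would inspect request.pid here and queue the minidump
  // write, replying as soon as the inspection itself is finished.
  DumpMessage reply = {INSPECTION_DONE, request.pid};
  if (write(fd, &reply, sizeof(reply)) != static_cast<ssize_t>(sizeof(reply)))
    return -1;
  return static_cast<pid_t>(request.pid);
}
```

Replying once inspection is done, rather than once the file is on disk, keeps
the time during which the crashing process stays suspended short.
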
## Mac OSX

Out-of-process exception handling is fully supported on Mac.