gRPC Connectivity Semantics and API
===================================

This document describes the connectivity semantics for gRPC channels and the
corresponding impact on RPCs. We then discuss an API.

States of Connectivity
----------------------

gRPC Channels provide the abstraction over which clients can communicate with
servers. The client-side channel object can be constructed using little more
than a DNS name. Channels encapsulate a range of functionality including name
resolution, establishing a TCP connection (with retries and backoff) and TLS
handshakes. Channels can also handle errors on established connections and
reconnect, or in the case of HTTP/2 GOAWAY, re-resolve the name and reconnect.

To hide the details of all this activity from the user of the gRPC API (i.e.,
application code) while exposing meaningful information about the state of a
channel, we use a state machine with five states, defined below:

CONNECTING: The channel is trying to establish a connection and is waiting to
make progress on one of the steps involved in name resolution, TCP connection
establishment or TLS handshake. This may be used as the initial state for
channels upon creation.

READY: The channel has successfully established a connection all the way through
TLS handshake (or equivalent) and protocol-level (HTTP/2, etc.) handshaking, and
all subsequent attempts to communicate have succeeded (or are pending without any
known failure).

TRANSIENT_FAILURE: There has been some transient failure (such as a TCP 3-way
handshake timing out or a socket error). Channels in this state will eventually
switch to the CONNECTING state and try to establish a connection again. Since
retries are done with exponential backoff, channels that fail to connect will
start out spending very little time in this state, but as the attempts fail
repeatedly, the channel will spend increasingly large amounts of time in this
state. This applies to many non-fatal failures, e.g., TCP connection attempts
timing out because the server is not yet available.

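The growth of the time spent in TRANSIENT_FAILURE can be sketched as a small function. This is only an illustration: the struct `BackoffParams`, the function `BackoffAfterFailure`, and the parameter values (1 second initial delay, multiplier 1.6, 120 second cap) are assumptions made for the example and are not specified by this document. Jitter is left out so that the result is deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// Illustrative parameters only (assumed for this sketch).
struct BackoffParams {
  std::chrono::milliseconds initial{1000};
  double multiplier = 1.6;
  std::chrono::milliseconds max{120000};
};

// Returns how long a channel would stay in TRANSIENT_FAILURE after the given
// number of consecutive failed attempts (1-based) before returning to
// CONNECTING. Hypothetical helper, not a gRPC API.
inline std::chrono::milliseconds BackoffAfterFailure(
    int failures, const BackoffParams& params = BackoffParams()) {
  assert(failures >= 1);
  const double cap_ms = static_cast<double>(params.max.count());
  double delay_ms = static_cast<double>(params.initial.count());
  for (int i = 1; i < failures; ++i) {
    // Each further failure multiplies the wait, up to the cap.
    delay_ms = std::min(delay_ms * params.multiplier, cap_ms);
  }
  return std::chrono::milliseconds(static_cast<long long>(delay_ms));
}
```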
IDLE: This is the state where the channel is not even trying to create a
connection because of a lack of new or pending RPCs. New RPCs MAY be created
in this state. Any attempt to start an RPC on the channel will push the channel
out of this state to CONNECTING. When there has been no RPC activity on a channel
for a specified IDLE_TIMEOUT, i.e., no new or pending (active) RPCs for this
period, channels that are READY or CONNECTING switch to IDLE. Additionally,
channels that receive a GOAWAY when there are no active or pending RPCs should
also switch to IDLE to avoid connection overload at servers that are attempting
to shed connections. We will use a default IDLE_TIMEOUT of 300 seconds (5 minutes).

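The two conditions for entering IDLE can be written as a single predicate. The sketch below is a hypothetical helper, not a gRPC API: the names `State`, `ShouldEnterIdle`, and `kDefaultIdleTimeout` are invented for illustration, and only the 300 second default comes from the text above.

```cpp
#include <cassert>
#include <chrono>

enum class State { kConnecting, kReady, kTransientFailure, kIdle, kShutdown };

// Default IDLE_TIMEOUT from the text above.
constexpr std::chrono::seconds kDefaultIdleTimeout{300};

// Decides whether a channel should switch to IDLE. `quiet_for` is the time
// since the last RPC activity; `received_goaway` models an HTTP/2 GOAWAY.
inline bool ShouldEnterIdle(
    State state, int pending_rpcs, std::chrono::seconds quiet_for,
    bool received_goaway,
    std::chrono::seconds idle_timeout = kDefaultIdleTimeout) {
  assert(pending_rpcs >= 0);
  if (pending_rpcs > 0) return false;  // active RPCs keep the channel busy
  // A GOAWAY on an established connection with nothing pending: go IDLE now.
  if (state == State::kReady && received_goaway) return true;
  // Only READY and CONNECTING channels time out into IDLE.
  const bool may_time_out =
      state == State::kReady || state == State::kConnecting;
  return may_time_out && quiet_for >= idle_timeout;
}
```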
SHUTDOWN: This channel has started shutting down. Any new RPCs should fail
immediately. Pending RPCs may continue running until the application cancels them.
Channels may enter this state either because the application explicitly requested
a shutdown or if a non-recoverable error has happened during attempts to connect
or communicate. (As of 6/12/2015, there are no known errors (while connecting or
communicating) that are classified as non-recoverable.) Channels that enter this
state never leave this state.

The following table lists the legal transitions from one state to another and
corresponding reasons. Empty cells denote disallowed transitions.

<table style='border: 1px solid black'>
  <tr>
    <th>From/To</th>
    <th>CONNECTING</th>
    <th>READY</th>
    <th>TRANSIENT_FAILURE</th>
    <th>IDLE</th>
    <th>SHUTDOWN</th>
  </tr>
  <tr>
    <th>CONNECTING</th>
    <td>Incremental progress during connection establishment</td>
    <td>All steps needed to establish a connection succeeded</td>
    <td>Any failure in any of the steps needed to establish connection</td>
    <td>No RPC activity on channel for IDLE_TIMEOUT</td>
    <td>Shutdown triggered by application.</td>
  </tr>
  <tr>
    <th>READY</th>
    <td></td>
    <td>Incremental successful communication on established channel.</td>
    <td>Any failure encountered while expecting successful communication on
        established channel.</td>
    <td>No RPC activity on channel for IDLE_TIMEOUT <br>OR<br>upon receiving a GOAWAY while there are no pending RPCs.</td>
    <td>Shutdown triggered by application.</td>
  </tr>
  <tr>
    <th>TRANSIENT_FAILURE</th>
    <td>Wait time required to implement (exponential) backoff is over.</td>
    <td></td>
    <td></td>
    <td></td>
    <td>Shutdown triggered by application.</td>
  </tr>
  <tr>
    <th>IDLE</th>
    <td>Any new RPC activity on the channel</td>
    <td></td>
    <td></td>
    <td></td>
    <td>Shutdown triggered by application.</td>
  </tr>
  <tr>
    <th>SHUTDOWN</th>
    <td></td>
    <td></td>
    <td></td>
    <td></td>
    <td></td>
  </tr>
</table>


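The table above can be encoded as a lookup so that an implementation or a test can reject disallowed transitions. This is a minimal sketch: the enum `State` and the function `IsLegalTransition` are names chosen for illustration and are not part of any gRPC API. Same-state entries are treated as legal wherever the table gives a reason for them.

```cpp
#include <cassert>

enum class State { kConnecting, kReady, kTransientFailure, kIdle, kShutdown };

// Returns true if the transition table allows moving from `from` to `to`.
inline bool IsLegalTransition(State from, State to) {
  // Rows are the "from" state and columns the "to" state, both in enum order:
  // CONNECTING, READY, TRANSIENT_FAILURE, IDLE, SHUTDOWN.
  static const bool kLegal[5][5] = {
      /* CONNECTING        */ {true, true, true, true, true},
      /* READY             */ {false, true, true, true, true},
      /* TRANSIENT_FAILURE */ {true, false, false, false, true},
      /* IDLE              */ {true, false, false, false, true},
      /* SHUTDOWN          */ {false, false, false, false, false},
  };
  const int row = static_cast<int>(from);
  const int col = static_cast<int>(to);
  assert(row >= 0 && row < 5 && col >= 0 && col < 5);
  return kLegal[row][col];
}
```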
Channel State API
-----------------

All gRPC libraries will expose a channel-level API method to poll the current
state of a channel. In C++, this method is called `GetState` and returns an enum
for one of the five legal states. It also accepts a boolean `try_to_connect` to
transition to CONNECTING if the channel is currently IDLE. The boolean should
act as if an RPC occurred, so it should also reset IDLE_TIMEOUT.

```cpp
grpc_connectivity_state GetState(bool try_to_connect);
```

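The effect of `try_to_connect` can be illustrated with a toy model. `ToyChannel` and `Demo` below are hypothetical and stand in for a real channel; the sketch also assumes that the call returns the state observed at the moment of the call, which this document does not specify.

```cpp
#include <cassert>

enum class State { kConnecting, kReady, kTransientFailure, kIdle, kShutdown };

// Hypothetical stand-in for a channel; not the gRPC class.
class ToyChannel {
 public:
  // Returns the state observed at call time (an assumption of this sketch).
  // With try_to_connect set, the call counts as RPC activity: the idle timer
  // restarts and an IDLE channel moves to CONNECTING.
  State GetState(bool try_to_connect) {
    const State observed = state_;
    if (try_to_connect && state_ != State::kShutdown) {
      ++idle_timer_resets_;
      if (state_ == State::kIdle) state_ = State::kConnecting;
    }
    return observed;
  }

  int idle_timer_resets() const { return idle_timer_resets_; }

 private:
  State state_ = State::kIdle;
  int idle_timer_resets_ = 0;
};

// Walks through the behavior described above.
inline void Demo() {
  ToyChannel channel;
  assert(channel.GetState(false) == State::kIdle);  // pure poll, no effect
  assert(channel.GetState(true) == State::kIdle);   // kicks off a connect
  assert(channel.GetState(false) == State::kConnecting);
  assert(channel.idle_timer_resets() == 1);
}
```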
All libraries should also expose an API that enables the application (user of
the gRPC API) to be notified when the channel state changes. Since state
changes can be rapid and race with any such notification, the notification
should just inform the user that some state change has happened, leaving it to
the user to poll the channel for the current state.

The synchronous version of this API is:

```cpp
bool WaitForStateChange(grpc_connectivity_state source_state, gpr_timespec deadline);
```

which returns `true` when the state is something other than the
`source_state` and `false` if the deadline expires. Asynchronous- and futures-based
APIs should have a corresponding method that allows the application to be
notified when the state of a channel changes.

Note that a notification is delivered every time there is a transition from any
state to any *other* state. On the other hand, the rules for legal state
transitions require a transition from CONNECTING to TRANSIENT_FAILURE and back
to CONNECTING for every recoverable failure, even if the corresponding
exponential backoff requires no wait before retry. The combined effect is that
the application may receive state change notifications that appear spurious.
For example, an application waiting for state changes on a channel that is
CONNECTING may receive a state change notification but find the channel in the
same CONNECTING state on polling for the current state, because the channel may
have spent an infinitesimally small amount of time in the TRANSIENT_FAILURE state.
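
Because a notification may look spurious, a caller is safer re-polling after every wake-up than assuming the state has changed. The sketch below shows one way to write such a loop. `WaitUntilReady`, `ScriptedChannel`, `State`, and `Deadline` are hypothetical names for this example; the loop only assumes a channel type with `GetState` and `WaitForStateChange` members shaped like the declarations above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

enum class State { kConnecting, kReady, kTransientFailure, kIdle, kShutdown };
using Deadline = std::chrono::steady_clock::time_point;

// Blocks until the channel is READY. Returns false if the deadline expires
// or the channel is shut down. The state is re-read after every notification.
template <typename Channel>
bool WaitUntilReady(Channel& channel, Deadline deadline) {
  for (;;) {
    const State current = channel.GetState(/*try_to_connect=*/true);
    if (current == State::kReady) return true;
    if (current == State::kShutdown) return false;  // terminal state
    if (!channel.WaitForStateChange(current, deadline)) return false;
  }
}

// Hypothetical test double: replays a fixed list of polled states and
// reports every wait as "notified", even if the next poll looks unchanged.
class ScriptedChannel {
 public:
  explicit ScriptedChannel(std::vector<State> polled)
      : polled_(std::move(polled)) {
    assert(!polled_.empty());
  }
  State GetState(bool /*try_to_connect*/) {
    const State s = polled_[next_];
    if (next_ + 1 < polled_.size()) ++next_;  // stay on the last entry
    return s;
  }
  bool WaitForStateChange(State /*source_state*/, Deadline /*deadline*/) {
    ++notifications_;
    return true;
  }
  int notifications() const { return notifications_; }

 private:
  std::vector<State> polled_;
  std::size_t next_ = 0;
  int notifications_ = 0;
};
```

Feeding the loop a script of CONNECTING, CONNECTING, READY models the case described above: the first notification is followed by a poll that still reports CONNECTING, and the loop simply waits again instead of treating it as an error.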