gRPC Connectivity Semantics and API
===================================

This document describes the connectivity semantics for gRPC channels and the
corresponding impact on RPCs. We then discuss an API.

States of Connectivity
----------------------

gRPC Channels provide the abstraction over which clients can communicate with
servers. The client-side channel object can be constructed using little more
than a DNS name. Channels encapsulate a range of functionality including name
resolution, establishing a TCP connection (with retries and backoff) and TLS
handshakes. Channels can also handle errors on established connections and
reconnect, or in the case of HTTP/2 GOAWAY, re-resolve the name and reconnect.

To hide the details of all this activity from the user of the gRPC API (i.e.,
application code) while exposing meaningful information about the state of a
channel, we use a state machine with five states, defined below:

CONNECTING: The channel is trying to establish a connection and is waiting to
make progress on one of the steps involved in name resolution, TCP connection
establishment or TLS handshake. This may be used as the initial state for
channels upon creation.

READY: The channel has successfully established a connection all the way through
TLS handshake (or equivalent) and protocol-level (HTTP/2, etc.) handshaking, and
all subsequent attempts to communicate have succeeded (or are pending without any
known failure).

TRANSIENT_FAILURE: There has been some transient failure (such as a TCP 3-way
handshake timing out or a socket error). Channels in this state will eventually
switch to the CONNECTING state and try to establish a connection again.
Since retries are done with exponential backoff, channels that fail to connect
start out spending very little time in this state, but as attempts fail
repeatedly (e.g., TCP connection attempts timing out because the server is not
yet available), the channel spends increasingly large amounts of time in this
state.

IDLE: This is the state where the channel is not even trying to create a
connection because of a lack of new or pending RPCs. New RPCs MAY be created
in this state. Any attempt to start an RPC on the channel will push the channel
out of this state to CONNECTING. When there has been no RPC activity on a channel
for a specified IDLE_TIMEOUT, i.e., no new or pending (active) RPCs for this
period, channels that are READY or CONNECTING switch to IDLE. Additionally,
channels that receive a GOAWAY when there are no active or pending RPCs should
also switch to IDLE to avoid connection overload at servers that are attempting
to shed connections. We will use a default IDLE_TIMEOUT of 300 seconds (5 minutes).

SHUTDOWN: This channel has started shutting down.
Any new RPCs should fail immediately. Pending RPCs may continue running until
the application cancels them. Channels may enter this state either because the
application explicitly requested a shutdown or because a non-recoverable error
has happened during attempts to connect or communicate. (As of 6/12/2015, there
are no known errors (while connecting or communicating) that are classified as
non-recoverable.) Channels that enter this state never leave this state.

The following table lists the legal transitions from one state to another and
corresponding reasons. Empty cells denote disallowed transitions.

<table style='border: 1px solid black'>
  <tr>
    <th>From/To</th>
    <th>CONNECTING</th>
    <th>READY</th>
    <th>TRANSIENT_FAILURE</th>
    <th>IDLE</th>
    <th>SHUTDOWN</th>
  </tr>
  <tr>
    <th>CONNECTING</th>
    <td>Incremental progress during connection establishment</td>
    <td>All steps needed to establish a connection succeeded</td>
    <td>Any failure in any of the steps
    needed to establish a connection</td>
    <td>No RPC activity on channel for IDLE_TIMEOUT</td>
    <td>Shutdown triggered by application.</td>
  </tr>
  <tr>
    <th>READY</th>
    <td></td>
    <td>Incremental successful communication on established channel.</td>
    <td>Any failure encountered while expecting successful communication on
        established channel.</td>
    <td>No RPC activity on channel for IDLE_TIMEOUT <br>OR<br>upon receiving a GOAWAY while there are no pending RPCs.</td>
    <td>Shutdown triggered by application.</td>
  </tr>
  <tr>
    <th>TRANSIENT_FAILURE</th>
    <td>Wait time required to implement (exponential) backoff is over.</td>
    <td></td>
    <td></td>
    <td></td>
    <td>Shutdown triggered by application.</td>
  </tr>
  <tr>
    <th>IDLE</th>
    <td>Any new RPC activity on the channel</td>
    <td></td>
    <td></td>
    <td></td>
    <td>Shutdown triggered
    by application.</td>
  </tr>
  <tr>
    <th>SHUTDOWN</th>
    <td></td>
    <td></td>
    <td></td>
    <td></td>
    <td></td>
  </tr>
</table>

Channel State API
-----------------

All gRPC libraries will expose a channel-level API method to poll the current
state of a channel. In C++, this method is called GetState and returns an enum
for one of the five legal states. It also accepts a boolean `try_to_connect` to
transition to CONNECTING if the channel is currently IDLE. The boolean should
act as if an RPC occurred, so it should also reset IDLE_TIMEOUT.

```cpp
grpc_connectivity_state GetState(bool try_to_connect);
```

All libraries should also expose an API that enables the application (user of
the gRPC API) to be notified when the channel state changes.
Since state changes can be rapid and race with any such notification, the
notification should just inform the user that some state change has happened,
leaving it to the user to poll the channel for the current state.

The synchronous version of this API is:

```cpp
bool WaitForStateChange(grpc_connectivity_state source_state, gpr_timespec deadline);
```

which returns `true` when the state is something other than the
`source_state` and `false` if the deadline expires. Asynchronous- and futures-based
APIs should have a corresponding method that allows the application to be
notified when the state of a channel changes.

Note that a notification is delivered every time there is a transition from any
state to any *other* state. On the other hand, the rules for legal state
transitions require a transition from CONNECTING to TRANSIENT_FAILURE and back
to CONNECTING for every recoverable failure, even if the corresponding
exponential backoff requires no wait before retry.
The combined effect is that the application may receive state change
notifications that appear spurious. For example, an application waiting for
state changes on a channel that is CONNECTING may receive a state change
notification but find the channel in the same CONNECTING state on polling for
the current state, because the channel may have spent an infinitesimally small
amount of time in the TRANSIENT_FAILURE state.
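
To make the spurious-notification case concrete, here is a minimal sketch of a toy model. It is not gRPC code. It encodes the transition table from this document as a lookup array and counts a notification only when the state actually changes. All names in it (`State`, `kLegal`, `IsLegalTransition`, `ToyChannel`, `ReplayZeroBackoffFailure`) are illustrative assumptions, not part of any gRPC API, and starting the toy channel in CONNECTING is just one of the initial states the text allows.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for grpc_connectivity_state (not the real gRPC enum).
// The order matches the columns of the transition table in this document.
enum class State { kConnecting, kReady, kTransientFailure, kIdle, kShutdown };

constexpr std::size_t kNumStates = 5;

// kLegal[from][to] is true where the table has a non-empty cell.
constexpr bool kLegal[kNumStates][kNumStates] = {
    /* CONNECTING        */ {true, true, true, true, true},
    /* READY             */ {false, true, true, true, true},
    /* TRANSIENT_FAILURE */ {true, false, false, false, true},
    /* IDLE              */ {true, false, false, false, true},
    /* SHUTDOWN          */ {false, false, false, false, false},
};

constexpr bool IsLegalTransition(State from, State to) {
  return kLegal[static_cast<std::size_t>(from)][static_cast<std::size_t>(to)];
}

// Toy channel: tracks the current state and how many notifications an
// observer would have received so far.
class ToyChannel {
 public:
  State GetState() const { return state_; }
  int notifications() const { return notifications_; }

  // Applies the transition if the table allows it and returns whether it was
  // applied. A notification is counted only for a move to a *different* state.
  bool Transition(State to) {
    if (!IsLegalTransition(state_, to)) return false;
    if (to != state_) ++notifications_;
    state_ = to;
    return true;
  }

 private:
  State state_ = State::kConnecting;  // assumed initial state for this sketch
  int notifications_ = 0;
};

// Replays the scenario from the text: a connection attempt fails and the
// backoff requires no wait, so the channel returns to CONNECTING at once.
// Expects the channel to be in CONNECTING or READY on entry.
// Returns the number of notifications generated by the round trip.
int ReplayZeroBackoffFailure(ToyChannel& channel) {
  const int before = channel.notifications();
  bool ok = channel.Transition(State::kTransientFailure);
  assert(ok);
  ok = channel.Transition(State::kConnecting);
  assert(ok);
  (void)ok;  // silences unused-variable warnings when NDEBUG is defined
  return channel.notifications() - before;
}
```

In this model, an observer that saw CONNECTING before the round trip and polls again afterwards sees CONNECTING both times, even though two notifications were generated. This is why a notification is best treated only as a cue to poll the channel again.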