<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "Wayland.ent">
%BOOK_ENTITIES;
]>
<chapter id="chap-Protocol">
  <title>Wayland Protocol and Model of Operation</title>
  <section id="sect-Protocol-Basic-Principles">
    <title>Basic Principles</title>
    <para>
      The Wayland protocol is an asynchronous object-oriented protocol.  All
      requests are method invocations on some object.  The requests include
      an object ID that uniquely identifies an object on the server.  Each
      object implements an interface and the requests include an opcode that
      identifies which method in the interface to invoke.
    </para>
    <para>
      The protocol is message-based.  A message sent by a client to the server
      is called a request.  A message from the server to a client is called an
      event.  A message has a number of arguments, each of which has a certain
      type (see <xref linkend="sect-Protocol-Wire-Format"/> for a list of
      argument types).
    </para>
    <para>
      Additionally, the protocol can specify <type>enum</type>s which associate
      names with specific numeric values.  These are primarily descriptive:
      at the wire format level, enums are just integers.  They also serve a
      secondary purpose: to enhance type safety or otherwise add context for
      use in language bindings or other such code.  This latter usage is only
      supported so long as code written before these attributes were
      introduced still works afterwards; in other words, adding an enum should
      not break API, since that would put backwards compatibility at risk.
    </para>
    <para>
      <type>enum</type>s can be defined as just a set of integers, or as
      bitfields.  This is specified via the <type>bitfield</type> boolean
      attribute in the <type>enum</type> definition.
      If this attribute is true, the enum is intended to be accessed
      primarily using bitwise operations, for example when arbitrarily many
      choices of the enum can be ORed together; if it is false, or the
      attribute is omitted, then the enum arguments are just a sequence of
      numerical values.
    </para>
    <para>
      The <type>enum</type> attribute can be used on either <type>uint</type>
      or <type>int</type> arguments; however, if the <type>enum</type> is
      defined as a <type>bitfield</type>, it can only be used on
      <type>uint</type> arguments.
    </para>
    <para>
      The server sends events back to the client; each event is emitted from
      an object.  Events can be error conditions.  The event includes the
      object ID and the event opcode, from which the client can determine
      the type of event.  Events are generated either in response to requests
      (in which case the request and the event constitute a round trip) or
      spontaneously when the server state changes.
    </para>
    <para>
      <itemizedlist>
        <listitem>
          <para>
            State is broadcast on connect; events are sent
            out when state changes.  Clients must listen for
            these changes and cache the state.
            There is no need (or mechanism) to query server state.
          </para>
        </listitem>
        <listitem>
          <para>
            The server will broadcast the presence of a number of global objects,
            which in turn will broadcast their current state.
          </para>
        </listitem>
      </itemizedlist>
    </para>
  </section>
  <section id="sect-Protocol-Code-Generation">
    <title>Code Generation</title>
    <para>
      The interfaces, requests and events are defined in
      <filename>protocol/wayland.xml</filename>.
      This xml is used to generate the function prototypes that can be used by
      clients and compositors.
    </para>
    <para>
      The protocol entry points are generated as inline functions which just
      wrap the <function>wl_proxy_*</function> functions.
      The inline functions aren't part of the library ABI, and language
      bindings should generate their own stubs for the protocol entry points
      from the xml.
    </para>
  </section>
  <section id="sect-Protocol-Wire-Format">
    <title>Wire Format</title>
    <para>
      The protocol is sent over a UNIX domain stream socket, where the endpoint
      is usually named <systemitem class="service">wayland-0</systemitem>
      (although it can be changed via <emphasis>WAYLAND_DISPLAY</emphasis>
      in the environment).  Beginning in Wayland 1.15, implementations can
      optionally support server socket endpoints located at arbitrary
      locations in the filesystem by setting <emphasis>WAYLAND_DISPLAY</emphasis>
      to the absolute path at which the server endpoint listens.
    </para>
    <para>
      Every message is structured as 32-bit words; values are represented in the
      host's byte-order.  The message header has 2 words in it:
      <itemizedlist>
        <listitem>
          <para>
            The first word is the sender's object ID (32-bit).
          </para>
        </listitem>
        <listitem>
          <para>
            The second word has two 16-bit parts.  The upper 16 bits are the
            message size in bytes, including the header (i.e. it has a minimum
            value of 8).  The lower 16 bits are the request/event opcode.
          </para>
        </listitem>
      </itemizedlist>
      The payload describes the request/event arguments.  Every argument is always
      aligned to 32 bits.  Where padding is required, the value of padding bytes is
      undefined.  There is no prefix that describes the type; it is
      inferred from the xml specification.
    </para>
    <para>

      The representations of the argument types are as follows:
      <variablelist>
        <varlistentry>
          <term>int</term>
          <term>uint</term>
          <listitem>
            <para>
              The value is the 32-bit value of the signed/unsigned
              int.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>fixed</term>
          <listitem>
            <para>
              Signed 24.8 fixed-point numbers.  It is a signed type which
              offers a sign bit, 23 bits of integer precision and 8 bits of
              fractional precision.  This is exposed as an opaque struct with
              conversion helpers to and from double and int on the C API side.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>string</term>
          <listitem>
            <para>
              Starts with an unsigned 32-bit length (including null terminator),
              followed by the string contents, including terminating null byte,
              then padding to a 32-bit boundary.  A null value is represented
              with a length of 0.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>object</term>
          <listitem>
            <para>
              32-bit object ID.  A null value is represented with an ID of 0.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>new_id</term>
          <listitem>
            <para>
              The 32-bit object ID.  Generally, the interface used for the new
              object is inferred from the xml, but in the case where it's not
              specified, a new_id is preceded by a <code>string</code> specifying
              the interface name, and a <code>uint</code> specifying the version.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>array</term>
          <listitem>
            <para>
              Starts with a 32-bit array size in bytes, followed by the array
              contents verbatim, and finally padding to a 32-bit boundary.
            </para>
          </listitem>
        </varlistentry>
        <varlistentry>
          <term>fd</term>
          <listitem>
            <para>
              The file descriptor is not stored in the message buffer, but in
              the ancillary data of the UNIX domain socket message (msg_control).
            </para>
          </listitem>
        </varlistentry>
      </variablelist>
    </para>
    <para>
      The protocol does not specify the exact position of the ancillary data
      in the stream, except that the order of file descriptors is the same as
      the order of messages and <code>fd</code> arguments within messages on
      the wire.
    </para>
    <para>
      In particular, this means that any byte of the stream, even the message
      header, may carry the ancillary data with file descriptors.
    </para>
    <para>
      Clients and compositors should queue incoming data until they have
      whole messages to process, as file descriptors may arrive earlier
      or later than the corresponding data bytes.
    </para>
  </section>
  <xi:include href="ProtocolInterfaces.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
  <section id="sect-Protocol-Versioning">
    <title>Versioning</title>
    <para>
      Every interface is versioned and every protocol object implements a
      particular version of its interface.  For global objects, the maximum
      version supported by the server is advertised with the global and the
      actual version of the created protocol object is determined by the
      version argument passed to wl_registry.bind().  For objects that are
      not globals, their version is inferred from the object that created
      them.
    </para>
    <para>
      In order to keep things sane, this has a few implications for
      interface versions:
      <itemizedlist>
        <listitem>
          <para>
            The object creation hierarchy must be a tree.  Otherwise,
            inferring object versions from the parent object becomes
            much more difficult to track properly.
          </para>
        </listitem>
        <listitem>
          <para>
            When the version of an interface increases, so does the version
            of its parent (recursively until you get to a global interface).
          </para>
        </listitem>
        <listitem>
          <para>
            A global interface's version number acts like a counter for all
            of its child interfaces.  Whenever a child interface gets
            modified, the global parent's interface version number also
            increases (see above).  The child interface then takes on the
            same version number as the new version of its parent global
            interface.
          </para>
        </listitem>
      </itemizedlist>
    </para>
    <para>
      To illustrate the above, consider the wl_compositor interface.  It
      has two children, wl_surface and wl_region.  As of Wayland version
      1.2, wl_surface and wl_compositor are both at version 3.  If
      something is added to the wl_region interface, both wl_region and
      wl_compositor will get bumped to version 4.  If, afterwards,
      wl_surface is changed, both wl_compositor and wl_surface will be at
      version 5.  In this way the global interface version is used as a
      sort of "counter" for all of its child interfaces.  This makes it
      very simple to know the version of the child given the version of its
      parent.  The child is at the highest possible interface version that
      is less than or equal to its parent's version.
    </para>
    <para>
      It is worth noting a particular exception to the above versioning
      scheme.  The wl_display (and, by extension, wl_registry) interface
      cannot change because it is the core protocol object and its version
      is never advertised nor is there a mechanism to request a different
      version.
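    </para>
    <para>
      The child-version rule described in this section can be sketched in a
      few lines of Python.  This is only an illustration and not libwayland
      code: the function name <function>child_version</function> and the
      version histories are hypothetical, chosen to mirror the wl_compositor
      example earlier in this section (including the assumption that
      wl_region was at version 1 before the change).
    </para>
<programlisting><![CDATA[
def child_version(parent_version, child_versions):
    """Return the highest child interface version that is less than or
    equal to the version of the parent global (the rule stated above)."""
    candidates = [v for v in child_versions if v <= parent_version]
    if not candidates:
        raise ValueError("child interface does not exist at this parent version")
    return max(candidates)


# Hypothetical version histories matching the wl_compositor illustration:
# wl_surface is at 3, wl_region is bumped to 4, then wl_surface to 5.
WL_SURFACE_VERSIONS = [1, 2, 3, 5]
WL_REGION_VERSIONS = [1, 4]

if __name__ == "__main__":
    for parent in (3, 4, 5):
        print(parent,
              child_version(parent, WL_SURFACE_VERSIONS),
              child_version(parent, WL_REGION_VERSIONS))
]]></programlisting>
    <para>
      With these assumed histories, a wl_compositor bound at version 4
      yields a wl_surface at version 3 and a wl_region at version 4.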
    </para>
  </section>
  <section id="sect-Protocol-Connect-Time">
    <title>Connect Time</title>
    <para>
      There is no fixed connection setup information; the server emits
      multiple events at connect time to indicate the presence and
      properties of global objects: outputs, compositor, input devices.
    </para>
  </section>
  <section id="sect-Protocol-Security-and-Authentication">
    <title>Security and Authentication</title>
    <para>
      <itemizedlist>
        <listitem>
          <para>
            mostly about access to underlying buffers, need new drm auth
            mechanism (the grant-to ioctl idea), need to check the cmd stream?
          </para>
        </listitem>
        <listitem>
          <para>
            getting the server socket depends on the compositor type, could
            be a system wide name, through fd passing on the session dbus,
            or the client is forked by the compositor and the fd is
            already opened.
          </para>
        </listitem>
      </itemizedlist>
    </para>
  </section>
  <section id="sect-Protocol-Creating-Objects">
    <title>Creating Objects</title>
    <para>
      Each object has a unique ID.  The IDs are allocated by the entity
      creating the object (either client or server).  IDs allocated by the
      client are in the range [1, 0xfeffffff] while IDs allocated by the
      server are in the range [0xff000000, 0xffffffff].  The 0 ID is
      reserved to represent a null or non-existent object.

      For efficiency purposes, the IDs are densely packed in the sense that
      the ID N will not be used until N-1 has been used.  Any ID allocation
      algorithm that does not maintain this property is incompatible with
      the implementation in libwayland.
    </para>
  </section>
  <section id="sect-Protocol-Compositor">
    <title>Compositor</title>
    <para>
      The compositor is a global object, advertised at connect time.
    </para>
    <para>
      See <xref linkend="protocol-spec-wl_compositor"/> for the
      protocol description.
    </para>
  </section>
  <section id="sect-Protocol-Surface">
    <title>Surfaces</title>
    <para>
      A surface manages a rectangular grid of pixels that clients create
      for displaying their content on the screen.  Clients don't know
      the global position of their surfaces, and cannot access other
      clients' surfaces.
    </para>
    <para>
      Once the client has finished writing pixels, it 'commits' the
      buffer; this permits the compositor to access the buffer and read
      the pixels.  When the compositor is finished, it releases the
      buffer back to the client.
    </para>
    <para>
      See <xref linkend="protocol-spec-wl_surface"/> for the protocol
      description.
    </para>
  </section>
  <section id="sect-Protocol-Input">
    <title>Input</title>
    <para>
      A seat represents a group of input devices including mice,
      keyboards and touchscreens.  It has a keyboard and pointer
      focus.  Seats are global objects.  Pointer events are delivered
      in surface-local coordinates.
    </para>
    <para>
      The compositor maintains an implicit grab when a button is
      pressed, to ensure that the corresponding button release
      event gets delivered to the same surface.  But there is no way
      for clients to take an explicit grab.  Instead, surfaces can
      be mapped as 'popup', which combines transient window semantics
      with a pointer grab.
    </para>
    <para>
      To avoid race conditions, input events that are likely to
      trigger further requests (such as button presses, key events,
      pointer motions) carry serial numbers, and requests such as
      wl_surface.set_popup require that the serial number of the
      triggering event is specified.  The server maintains a
      monotonically increasing counter for these serial numbers.
    </para>
    <para>
      Input events also carry timestamps with millisecond granularity.
      Their base is undefined, so they can't be compared against
      system time (as obtained with clock_gettime or gettimeofday).
      They can be compared with each other though, and can for instance
      be used to identify sequences of button presses as double
      or triple clicks.
    </para>
    <para>
      See <xref linkend="protocol-spec-wl_seat"/> for the
      protocol description.
    </para>
    <para>
      Talk about:

      <itemizedlist>
        <listitem>
          <para>
            keyboard map, change events
          </para>
        </listitem>
        <listitem>
          <para>
            xkb on Wayland
          </para>
        </listitem>
        <listitem>
          <para>
            multi pointer Wayland
          </para>
        </listitem>
      </itemizedlist>
    </para>
    <para>
      A surface can change the pointer image when the surface is the pointer
      focus of the input device.  Wayland doesn't automatically change the
      pointer image when a pointer enters a surface, but expects the
      application to set the cursor it wants in response to the pointer
      focus and motion events.  The rationale is that a client has to manage
      changing pointer images for UI elements within the surface in response
      to motion events anyway, so we'll make that the only mechanism for
      setting or changing the pointer image.  If the server receives a request
      to set the pointer image after the surface loses pointer focus, the
      request is ignored.  To the client this will look like it successfully
      set the pointer image.
    </para>
    <para>
      Setting the pointer image to NULL causes the cursor to be hidden.
    </para>
    <para>
      The compositor will revert the pointer image back to a default image
      when no surface has the pointer focus for that device.
    </para>
    <para>
      What if the pointer moves from one window which has set a special
      pointer image to a surface that doesn't set an image in response to
      the motion event?  The new surface will be stuck with the special
      pointer image.
      We can't just revert the pointer image on leaving a
      surface, since if we immediately enter a surface that sets a different
      image, the image will flicker.  If a client does not set a pointer image
      when the pointer enters a surface, the pointer stays with the image set
      by the last surface that changed it, possibly even hidden.  Such a client
      is likely just broken.
    </para>
  </section>
  <section id="sect-Protocol-Output">
    <title>Output</title>
    <para>
      An output is a global object, advertised at connect time or as it
      comes and goes.
    </para>
    <para>
      See <xref linkend="protocol-spec-wl_output"/> for the protocol
      description.
    </para>
    <para>
    </para>
    <itemizedlist>
      <listitem>
        <para>
          laid out in a big (compositor) coordinate system
        </para>
      </listitem>
      <listitem>
        <para>
          basically xrandr over Wayland
        </para>
      </listitem>
      <listitem>
        <para>
          geometry needs position in compositor coordinate system
        </para>
      </listitem>
      <listitem>
        <para>
          events to advertise available modes, requests to move and change
          modes
        </para>
      </listitem>
    </itemizedlist>
  </section>
  <section id="sect-Protocol-data-sharing">
    <title>Data sharing between clients</title>
    <para>
      The Wayland protocol provides clients a mechanism for sharing
      data that allows the implementation of copy-paste and
      drag-and-drop.  The client providing the data creates a
      <function>wl_data_source</function> object and the clients
      obtaining the data will see it as a <function>wl_data_offer</function>
      object.  This interface allows the clients to agree on a mutually
      supported mime type and transfer the data via a file descriptor
      that is passed through the protocol.
    </para>
    <para>
      The next section explains the negotiation between data source and
      data offer objects.
      <xref linkend="sect-Protocol-data-sharing-devices"/>
      explains how these objects are created and passed to different
      clients using the <function>wl_data_device</function> interface
      that implements copy-paste and drag-and-drop support.
    </para>
    <para>
      See <xref linkend="protocol-spec-wl_data_offer"/>,
      <xref linkend="protocol-spec-wl_data_source"/>,
      <xref linkend="protocol-spec-wl_data_device"/> and
      <xref linkend="protocol-spec-wl_data_device_manager"/> for
      protocol descriptions.
    </para>
    <para>
      MIME is defined in RFCs 2045-2049.  A
      <ulink url="https://www.iana.org/assignments/media-types/media-types.xhtml">
      registry of MIME types</ulink> is maintained by the Internet Assigned
      Numbers Authority (IANA).
    </para>
    <section>
      <title>Data negotiation</title>
      <para>
        A client providing data to other clients will create a <function>wl_data_source</function>
        object and advertise the mime types for the formats it supports for
        that data through the <function>wl_data_source.offer</function>
        request.  On the receiving end, the data offer object will generate one
        <function>wl_data_offer.offer</function> event for each supported mime
        type.
      </para>
      <para>
        The actual data transfer happens when the receiving client sends a
        <function>wl_data_offer.receive</function> request.  This request takes
        a mime type and a file descriptor as arguments.  This request will generate a
        <function>wl_data_source.send</function> event on the sending client
        with the same arguments, and the latter client is expected to write its
        data to the given file descriptor using the chosen mime type.
      </para>
    </section>
    <section id="sect-Protocol-data-sharing-devices">
      <title>Data devices</title>
      <para>
        Data devices glue data sources and offers together.
        A data device is associated with a <function>wl_seat</function> and
        is obtained by the clients using the
        <function>wl_data_device_manager</function> factory object, which is
        also responsible for creating data sources.
      </para>
      <para>
        Clients are informed of new data offers through the
        <function>wl_data_device.data_offer</function> event.  After this
        event is generated the data offer will advertise the available mime
        types.  New data offers are introduced prior to their use for
        copy-paste or drag-and-drop.
      </para>
      <section>
        <title>Selection</title>
        <para>
          Each data device has a selection data source.  Clients create a data
          source object using the device manager and may set it as the
          current selection for a given data device.  Whenever the current
          selection changes, the client with keyboard focus receives a
          <function>wl_data_device.selection</function> event.  This event is
          also generated on a client immediately before it receives keyboard
          focus.
        </para>
        <para>
          The data offer is introduced with the
          <function>wl_data_device.data_offer</function> event before the
          selection event.
        </para>
      </section>
      <section>
        <title>Drag and Drop</title>
        <para>
          A drag-and-drop operation is started using the
          <function>wl_data_device.start_drag</function> request.  This
          request causes a pointer grab that will generate enter, motion and
          leave events on the data device.  A data source is supplied as an
          argument to start_drag, and data offers associated with it are
          supplied to client surfaces under the pointer in the
          <function>wl_data_device.enter</function> event.  The data offer
          is introduced to the client prior to the enter event with the
          <function>wl_data_device.data_offer</function> event.
        </para>
        <para>
          Clients are expected to provide feedback to the data sending client
          by calling the <function>wl_data_offer.accept</function> request with
          a mime type they accept.  If none of the advertised mime types is
          supported by the receiving client, it should supply NULL to the
          accept request.  The accept request causes the sending client to
          receive a <function>wl_data_source.target</function> event with the
          chosen mime type.
        </para>
        <para>
          When the drag ends, the receiving client receives a
          <function>wl_data_device.drop</function> event, at which point it is
          expected to transfer the data using the
          <function>wl_data_offer.receive</function> request.
        </para>
      </section>
    </section>
  </section>
</chapter>