Network Working Group                                Robert W. Scheifler
Request for Comments: 1013                                     June 1987

                 X WINDOW SYSTEM PROTOCOL, VERSION 11

                              Alpha Update
                               April 1987

     Copyright (c) 1986, 1987 Massachusetts Institute of Technology

                X Window System is a trademark of M.I.T.

Status of this Memo

   This RFC is distributed to the Internet community for information
   only. It does not establish an Internet standard. The X window
   system has been widely reviewed and tested. The internet community
   is encouraged to experiment with it. Distribution of this memo is
   unlimited (see copyright notice on page 2).

M.I.T.                                                          [Page 1]

RFC 1013                                                       June 1987

   Permission to use, copy, modify, and distribute this document for
   any purpose and without fee is hereby granted, provided that the
   above copyright notice appear in all copies and that both that
   copyright notice and this permission notice are retained, and that
   the name of M.I.T. not be used in advertising or publicity
   pertaining to this document without specific, written prior
   permission. M.I.T. makes no representations about the suitability
   of this document or the protocol defined in this document for any
   purpose. It is provided "as is" without express or implied warranty.

   Author:
        Robert W. Scheifler
        Laboratory for Computer Science
        545 Technology Square, Room 418
        Cambridge, MA 02139

   Contributors:
        Dave Carver (Digital HPW)
        Branko Gerovac (Digital HPW)
        Jim Gettys (MIT/Project Athena, Digital)
        Phil Karlton (Digital WSL)
        Scott McGregor (Digital SSG)
        Ram Rao (Digital UEG)
        David Rosenthal (Sun)
        Dave Winchell (Digital UEG)

   Implementors of initial server who provided useful input:
        Susan Angebranndt (Digital)
        Raymond Drewry (Digital)
        Todd Newman (Digital)

   Invited reviewers who provided useful input:
        Andrew Cherenson (Berkeley)
        Burns Fisher (Digital)
        Dan Garfinkel (HP)
        Leo Hourvitz (Next)
        Brock Krizan (HP)
        David Laidlaw (Stellar)
        Dave Mellinger (Interleaf)
        Ron Newman (MIT)
        John Ousterhout (Berkeley)
        Andrew Palay (ITC CMU)
        Ralph Swick (MIT)
        Craig Taylor (Sun)
        Jeffery Vroom (Stellar)

   This document does not attempt to provide the rationale or
   pragmatics required to fully understand the protocol or to place it
   in perspective within a complete system. Knowledge of X Version 10
   will certainly aid in understanding this document.

   The protocol contains many management mechanisms that are not
   intended for normal applications. Not all mechanisms are needed to
   build a particular user interface. It is important to keep in mind
   that the protocol is intended to provide mechanism, not policy.

   This document does not attempt to define precise formats or bit
   encodings.

-------------------------------------------------------------------

SECTION 1. TERMINOLOGY

Access control list
        X maintains a list of hosts from which client programs may be
        run. By default, only programs on the local host may use the
        display, plus any hosts specified in an initial list read by
        the server. This "access control list" can be changed by
        clients on the local host. Some server implementations may
        also implement other authorization mechanisms.
Active grab
        A grab is "active" when the pointer or keyboard is actually
        owned by the single grabbing client.

Ancestors
        If W is an inferior of A, then A is an "ancestor" of W.

Atom
        An "atom" is a unique id corresponding to a string name. Atoms
        are used to identify properties, types, and selections.

Backing store
        When a server maintains the contents of a window, the
        off-screen saved pixels are known as a "backing store".

Bit gravity
        When a window is resized, the contents of the window are not
        necessarily discarded. It is possible to request the server
        (though no guarantees are made) to relocate the previous
        contents to some region of the window. This attraction of
        window contents for some location of a window is known as
        "bit gravity".

Bitmap
        A "bitmap" is a pixmap of depth one.

Button grabbing
        Buttons on the pointer may be passively "grabbed" by a client.
        When the button is pressed, the pointer is then actively
        grabbed by the client.

Byte order
        For image (pixmap/bitmap) data, byte order is defined by the
        server, and clients with different native byte ordering must
        swap bytes as necessary. For all other parts of the protocol,
        the byte order is defined by the client, and the server swaps
        bytes as necessary.

Children
        The "children" of a window are its first-level subwindows.

Client
        An application program connects to the window system server by
        some interprocess communication (IPC) path, such as a TCP
        connection or a shared memory buffer. This program is referred
        to as a "client" of the window system server. More precisely,
        the client is the IPC path itself; a program with multiple
        paths open to the server is viewed as multiple clients by the
        protocol. Resource lifetimes are controlled by connection
        lifetimes, not by program lifetimes.

Clipping regions
        In a graphics context, a bitmap or list of rectangles can be
        specified to restrict output to a particular region of the
        window. The image defined by the bitmap or rectangles is
        called a "clipping region".
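The "Byte order" entry above can be illustrated with a small sketch. It assumes Python's standard struct module; the helper names encode_card32 and swap32 are hypothetical and are not defined by the protocol, which leaves precise encodings unspecified in this document.

```python
import struct

def encode_card32(value, msb_first):
    """Encode a 32-bit unsigned value in the byte order chosen by the client.

    msb_first is a hypothetical flag standing in for whatever byte order
    the client has declared for non-image protocol data.
    """
    fmt = ">I" if msb_first else "<I"
    return struct.pack(fmt, value)

def swap32(raw):
    """Reverse a 4-byte quantity, as a party with the opposite native
    byte order would have to do before interpreting it."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return raw[::-1]

msb = encode_card32(0x01020304, msb_first=True)
lsb = encode_card32(0x01020304, msb_first=False)
```

For non-image data the swap would fall to the server; for image data the roles are reversed and the client does the swapping.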
Color cell
        An entry in a colormap is known as a "color cell". An entry
        contains three values specifying red, green and blue
        intensities. These values are always viewed as 16 bit unsigned
        numbers, with zero being minimum intensity. The values are
        scaled by the server to match the display hardware. The
        components of a cell are coincident with components of other
        cells in DirectColor and TrueColor colormaps.

Colormap
        A "colormap" consists of a set of color cells. A pixel value
        indexes the colormap to produce intensities to be displayed.
        Depending on hardware limitations, one or more colormaps may
        be installed at one time, such that windows associated with
        those maps display with true colors.

Connection
        The IPC path between the server and client program is known as
        a "connection". A client program typically (but not
        necessarily) has one connection to the server over which
        requests and events are sent.

Containment
        A window "contains" the pointer if the window is viewable and
        the hotspot of the cursor is within a visible region of the
        window or a visible region of one of its inferiors. The border
        of the window is included as part of the window for
        containment. The pointer is "in" a window if the window
        contains the pointer but no inferior contains the pointer.

Coordinate system
        The coordinate system has X horizontal and Y vertical, with
        the origin [0, 0] at the upper left. Coordinates are discrete,
        and in terms of pixels. Each window and pixmap has its own
        coordinate system. For a window, the origin is at the inside
        upper left, inside the border.

Cursor
        A "cursor" is the visible shape of the pointer on a screen. It
        consists of a hot spot, a source bitmap, a shape bitmap, and a
        pair of colors. The cursor defined for a window controls the
        visible appearance when the pointer is in that window.

Depth
        The "depth" of a window or pixmap is the number of bits per
        pixel it has. The depth of a gcontext is the depth of the root
        of the gcontext.
Device
        Keyboards, mice, tablets, track-balls, button boxes, etc. are
        all collectively known as input "devices". The core protocol
        only deals with two devices, "the keyboard" and "the pointer".

Drawable
        Both windows and pixmaps may be used as sources and
        destinations in graphics operations. These are collectively
        known as "drawables". However, an InputOnly window cannot be
        used as a source or destination in a graphics operation.

Event
        Clients are informed of information asynchronously via
        "events". These events may be either asynchronously generated
        from devices, or generated as side effects of client requests.
        Events are grouped into types; events are never sent to a
        client by the server unless the client has specifically asked
        to be informed of that type of event, but other clients can
        force events to be sent to other clients. Events are typically
        reported relative to a window.

Event mask
        Events are requested relative to a window. The set of event
        types a client requests relative to a window is described
        using an "event mask".

Event synchronization
        There are certain race conditions possible when demultiplexing
        device events to clients (in particular deciding where pointer
        and keyboard events should be sent when in the middle of
        window management operations). The event synchronization
        mechanism allows synchronous processing of device events.

Event propagation
        Device-related events "propagate" from the source window to
        ancestor windows until some client has expressed interest in
        handling that type of event, or until the event is discarded
        explicitly.

Event source
        The smallest window containing the pointer is the "source" of
        a device related event.

Exposure event
        Servers do not guarantee to preserve the contents of windows
        when windows are obscured or reconfigured. "Exposure" events
        are sent to clients to inform them when contents of regions of
        windows have been lost.

Extension
        Named "extensions" to the core protocol can be defined to
        extend the system.
        Extensions to output requests, resources, and event types are
        all possible, and expected.

Font
        A "font" is an array of glyphs (typically characters). The
        protocol does no translation or interpretation of character
        sets. The client simply indicates values used to index the
        glyph array. A font contains additional metric information to
        determine inter-glyph and inter-line spacing.

Glyph
        A "glyph" is an image, typically of a character, in a font.

Grab
        Keyboard keys, the keyboard, pointer buttons, the pointer, and
        the server can be "grabbed" for exclusive use by a client. In
        general, these facilities are not intended to be used by
        normal applications, but are intended for various input and
        window managers to implement various styles of user
        interfaces.

Graphics context
        Various information for graphics output is stored in a
        "graphics context" (GC), such as foreground pixel, background
        pixel, line width, clipping region, etc.

Hotspot
        A cursor has an associated "hot spot" which defines a point in
        the cursor that corresponds to the coordinates reported for
        the pointer.

Identifier
        Each resource has an "identifier", a unique value associated
        with it that clients use to name the resource. An identifier
        can be used over any connection to name the resource.

Inferiors
        The "inferiors" of a window are all of the subwindows nested
        below it: the children, the children's children, etc.

Input focus
        The "input focus" is nominally where keyboard input goes.
        Keyboard events are by default sent to the client expressing
        interest on the window the pointer is in. This is said to be a
        "real estate driven" input focus. It is also possible to
        attach the keyboard input to a specific window; events will
        then be sent to the appropriate client independent of the
        pointer position.

Input manager
        Control over keyboard input is typically provided by an "input
        manager" client.

InputOnly window
        A window that cannot be used for graphics requests.
        InputOnly windows are "invisible", and can be used to control
        such things as cursors, input event generation, and grabbing.

InputOutput window
        The "normal" kind of opaque window, used for both input and
        output.

Key grabbing
        Keys on the keyboard may be passively "grabbed" by a client.
        When the key is pressed, the keyboard is then actively grabbed
        by the client.

Keyboard grabbing
        A client can actively "grab" control of the keyboard, and key
        events will be sent to that client rather than the client the
        events would normally have been sent to.

Mapping
        A window is said to be "mapped" if a map call has been
        performed on it. Unmapped windows are never viewable or
        visible.

Modifier keys
        Shift, Control, Meta, Super, Hyper, ALT, Compose, Apple,
        CapsLock, ShiftLock, and similar keys are called "modifier"
        keys.

Obscures
        Window A "obscures" window B if both are viewable InputOutput
        windows and A is higher in the global stacking order, and the
        rectangle defined by the outside edges of A intersects the
        rectangle defined by the outside edges of B. Note the (fine)
        distinction with "occludes". Also note that window borders are
        included in the calculation.

Occludes
        Window A "occludes" window B if both are mapped and A is
        higher in the global stacking order, and the rectangle defined
        by the outside edges of A intersects the rectangle defined by
        the outside edges of B. Note the (fine) distinction with
        "obscures". Also note that window borders are included in the
        calculation.

Padding
        Some padding bytes are inserted in the data stream to maintain
        alignment of the protocol requests on natural boundaries. This
        increases ease of portability to some machine architectures.

Parent window
        If C is a child of P, then P is the "parent" of C.

Passive grab
        Grabbing a key or button is a "passive" grab. The grab
        activates when the key or button is actually pressed.
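The fine distinction between the "Obscures" and "Occludes" entries above may be easier to see in code. The following is only a sketch under stated assumptions: the Win record, its fields, the use of a plain integer for the global stacking order, and the choice to store the outside upper-left corner directly are all hypothetical conveniences, not protocol structures.

```python
from dataclasses import dataclass

@dataclass
class Win:
    x: int                      # outside upper-left corner (assumed shared coordinates)
    y: int
    width: int                  # inside width, excluding the border
    height: int                 # inside height, excluding the border
    border: int                 # border width in pixels
    stack: int                  # assumption: larger means higher in global stacking order
    mapped: bool = True
    viewable: bool = True       # assumption: computed elsewhere from ancestor mapping
    input_output: bool = True   # False stands for an InputOnly window

def outer_rect(w):
    """Rectangle defined by the outside edges, so the border is included."""
    return (w.x, w.y,
            w.x + w.width + 2 * w.border,
            w.y + w.height + 2 * w.border)

def intersects(a, b):
    ax0, ay0, ax1, ay1 = outer_rect(a)
    bx0, by0, bx1, by1 = outer_rect(b)
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def occludes(a, b):
    # Both mapped, A higher, outside rectangles intersect.
    return a.mapped and b.mapped and a.stack > b.stack and intersects(a, b)

def obscures(a, b):
    # Both viewable InputOutput windows, A higher, outside rectangles intersect.
    return (a.viewable and b.viewable
            and a.input_output and b.input_output
            and a.stack > b.stack and intersects(a, b))

top_input_only = Win(0, 0, 10, 10, 1, stack=2, input_output=False)
below = Win(5, 5, 10, 10, 0, stack=1)
```

In this sketch an InputOnly window stacked above another window occludes it without obscuring it, which is exactly the case the two definitions are meant to separate.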
Pixel value
        A "pixel" is an N-bit value, where N is the number of bit
        planes used in a particular window or pixmap. For a window, a
        pixel value indexes a colormap to derive an actual color to be
        displayed.

Pixmap
        A "pixmap" is a three dimensional array of bits. A pixmap is
        normally thought of as a two dimensional array of pixels,
        where each pixel can be a value from 0 to (2^N)-1, where N is
        the depth (z axis) of the pixmap. A pixmap can also be thought
        of as a stack of N bitmaps.

Plane mask
        Graphics operations can be restricted to only affect a subset
        of bit planes of a destination. A "plane mask" is a bit mask
        describing which planes are to be modified, and is stored in a
        graphics context.

Pointer
        The "pointer" is the pointing device attached to the cursor,
        and tracked on the screens.

Pointer grabbing
        A client can actively "grab" control of the pointer, and
        button and motion events will be sent to that client rather
        than the client the events would normally have been sent to.

Pointing device
        A "pointing device" is typically a mouse or tablet, or some
        other device with effective dimensional motion. Only one
        visible cursor is defined by the core protocol, and it tracks
        whatever pointing device is attached as the pointer.

Property
        Windows may have associated "properties", consisting of a
        name, a type, a data format, and some data. The protocol
        places no interpretation on properties; they are intended as a
        general-purpose naming mechanism for clients. For example,
        clients might share information such as resize hints, program
        names, and icon formats with a window manager via properties.

Property list
        The "property list" of a window is the list of properties that
        have been defined for the window.

Redirecting control
        Window managers (or client programs) may wish to enforce
        window layout policy in various ways.
        When a client attempts to change the size or position of a
        window, the operation may be "redirected" to a specified
        client, rather than the operation actually being performed.

Reply
        Information requested by a client program is sent back to the
        client with a "reply". Both events and replies are multiplexed
        on the same connection. Most requests do not generate replies.

Request
        A command to the server is called a "request". It is a single
        block of data sent over a connection.

Resource
        Windows, pixmaps, cursors, fonts, graphics contexts, and
        colormaps are known as "resources". They all have unique
        identifiers associated with them for naming purposes. The
        lifetime of a resource is bounded by the lifetime of the
        connection over which the resource was created.

Root
        The "root" of a pixmap or gcontext is the same as the root of
        whatever drawable was used when the pixmap or gcontext was
        created. The "root" of a window is the root window under which
        the window was created.

Root window
        Each screen has a "root window" covering it. It cannot be
        reconfigured or unmapped, but otherwise acts as a full fledged
        window. A root window has no parent.

Save set
        The "save set" of a client is a list of other clients' windows
        which, if they are inferiors of one of the client's windows at
        connection close, should not be destroyed, and which should be
        remapped if they are unmapped. Save sets are typically used by
        window managers to avoid lost windows if the manager should
        terminate abnormally.

Screen
        A server may provide several independent "screens", which
        typically have physically independent monitors. This would be
        the expected configuration when there is only a single
        keyboard and pointer shared among the screens.

Server
        The "server" provides the basic windowing mechanism. It
        handles IPC connections from clients, demultiplexes graphics
        requests onto the screens, and multiplexes input back to the
        appropriate clients.
Server grabbing
        The server can be "grabbed" by a single client for exclusive
        use. This prevents processing of any requests from other
        client connections until the grab is complete. This is
        typically only a transient state for such things as
        rubber-banding and pop-up menus, or to execute requests
        indivisibly.

Sibling
        Children of the same parent window are known as "sibling"
        windows.

Stacking order
        Sibling windows may "stack" on top of each other. Windows
        above both obscure and occlude lower windows. This is similar
        to paper on a desk. The relationship between sibling windows
        is known as the "stacking order".

Stipple
        A "stipple pattern" is a bitmap that is used to tile a region
        to serve as an additional clip mask for a fill operation with
        the foreground color.

Tile
        A pixmap can be replicated in two dimensions to "tile" a
        region. The pixmap itself is also known as a "tile".

Timestamp
        A time value, expressed in milliseconds, typically since the
        last server reset. Timestamp values wrap around (after about
        49.7 days). The server, given its current time is represented
        by timestamp T, always interprets timestamps from clients by
        treating half of the timestamp space as being earlier in time
        than T, and half of the timestamp space as being later in time
        than T. One timestamp value (named CurrentTime) is never
        generated by the server; this value is reserved for use in
        requests to represent the current server time.

Type
        A type is an arbitrary atom used to identify the
        interpretation of property data. Types are completely
        uninterpreted by the server; they are solely for the benefit
        of clients.

Unviewable
        A window is "unviewable" if it is mapped but some ancestor is
        unmapped.

Viewable
        A window is "viewable" if it and all of its ancestors are
        mapped. This does not imply that any portion of the window is
        actually visible.
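The wraparound rule in the "Timestamp" entry above can be sketched as modular arithmetic on a 32-bit millisecond counter. This is a minimal sketch, not the server's actual algorithm: the function names are hypothetical, and the treatment of the exact halfway point (counted as "earlier" here) is an assumption, since the text does not say which half it belongs to.

```python
WRAP = 1 << 32   # size of the timestamp space (a 32-bit millisecond counter)
HALF = 1 << 31   # half of the timestamp space

def is_later(ts, server_now):
    """True if ts lies in the half of the timestamp space after server_now."""
    delta = (ts - server_now) % WRAP
    # Assumption: a delta of exactly HALF is treated as earlier, not later.
    return 0 < delta < HALF

def wrap_period_days():
    """Days until a millisecond counter of this size wraps around."""
    return WRAP / (1000 * 60 * 60 * 24)
```

With these definitions, a timestamp of 5 is "later" than 0xFFFFFFF0 even though it is numerically smaller, because the counter has wrapped in between; wrap_period_days() evaluates to roughly 49.7, matching the figure in the entry.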
Visible
        A region of a window is "visible" if someone looking at the
        screen can actually "see" it: the window is viewable and the
        region is not occluded by any other window.

Window gravity
        When windows are resized, subwindows may be repositioned
        automatically relative to some position in the window. This
        attraction of a subwindow to some part of its parent is known
        as "window gravity".

Window manager
        Manipulation of windows on the screen, and much of the user
        interface (policy) is typically provided by a "window manager"
        client.

XYFormat
        The data for a pixmap is said to be in "XYFormat" if it is
        organized as a set of bitmaps representing individual bit
        planes.

ZFormat
        The data for a pixmap is said to be in "ZFormat" if it is
        organized as a set of pixel values in scanline order.

SECTION 2. PROTOCOL FORMATS

Request Format

   Every request contains an 8-bit "major" opcode, and a 16-bit length
   field expressed in units of 4 bytes. Every request consists of 4
   bytes of header (containing the major opcode, the length field, and
   a data byte) followed by zero or more additional bytes of data; the
   length field defines the total length of the request, including the
   header. The length field in a request must equal the minimum length
   required to contain the request; if the specified length is smaller
   or larger than the required length, an error is generated. Unused
   bytes in a request are not required to be zero. Major opcodes 128
   through 255 are reserved for extensions. Extensions are intended to
   contain multiple requests, so extension requests typically have an
   additional minor opcode encoded in the "spare" data byte in the
   request header, but the placement and interpretation of this minor
   opcode, and all other fields in extension requests, are not defined
   by the core protocol. Every request is implicitly assigned a
   sequence number, starting with one, used in replies, errors, and
   events.
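As a rough illustration of the request header described above, the sketch below packs a 4-byte header and computes the length field in 4-byte units. This document does not define precise bit encodings, so the field order used here (opcode, data byte, then length) is an assumption made for illustration, as are the function names and the zero padding of the body to a 4-byte boundary.

```python
import struct

def build_request(major_opcode, data_byte, body=b"", byte_order="<"):
    """Return header + padded body for one request.

    byte_order is "<" or ">" and stands for the order chosen by the client.
    """
    if not 0 <= major_opcode <= 255:
        raise ValueError("major opcode must fit in 8 bits")
    if not 0 <= data_byte <= 255:
        raise ValueError("data byte must fit in 8 bits")
    # Pad the body so the total length is a whole number of 4-byte units.
    padded = body + b"\x00" * (-len(body) % 4)
    # The length field counts the header too, which is one 4-byte unit.
    length_units = 1 + len(padded) // 4
    if length_units > 0xFFFF:
        raise ValueError("request too long for a 16-bit length field")
    # Assumed layout: opcode (1 byte), data byte (1 byte), length (2 bytes).
    header = struct.pack(byte_order + "BBH", major_opcode, data_byte, length_units)
    return header + padded

def is_extension_opcode(major_opcode):
    """Major opcodes 128 through 255 are reserved for extensions."""
    return 128 <= major_opcode <= 255

req = build_request(1, 0, b"abcde")
```

Here a 5-byte body is padded to 8 bytes, so the request is 12 bytes long and its length field holds 3.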
Reply Format

   Every reply contains a 32-bit length field expressed in units of 4
   bytes. Every reply consists of 32 bytes, followed by zero or more
   additional bytes of data, as specified in the length field. Unused
   bytes within a reply are not guaranteed to be zero. Every reply
   also contains the least significant 16 bits of the sequence number
   of the corresponding request.

Error Format

   Error reports are 32 bytes long. Every error includes an 8-bit
   error code. Error codes 128 through 255 are reserved for
   extensions. Every error also includes the major and minor opcodes
   of the failed request, and the least significant 16 bits of the
   sequence number of the request. For the following errors (see
   Section 5), the failing resource id is also returned: Colormap,
   Cursor, Drawable, Font, GContext, IDChoice, Pixmap, and Window. For
   Atom errors, the failing atom is returned. For Value errors, the
   failing value is returned. Other core errors return no additional
   data. Unused bytes within an error are not guaranteed to be zero.

Event Format

   Events are 32 bytes long. Unused bytes within an event are not
   guaranteed to be zero. Every event contains an 8-bit type code. The
   most significant bit in this code is set if the event was generated
   from a SendEvent request. Event codes 64 through 127 are reserved
   for extensions, although the core protocol does not define a
   mechanism for selecting interest in such events. Every core event
   (with the exception of KeymapNotify) also contains the least
   significant 16 bits of the sequence number of the last request
   issued by the client that was (or is currently being) processed by
   the server.

SECTION 3. SYNTAX

   The syntax {...} encloses a set of alternatives.

   The syntax [...] encloses a set of structure components.

   In general, TYPEs are in upper case and AlternativeValues are
   capitalized.

   Requests in Section 10 are described in the following format:

        RequestName
             arg1: type1
             ...
             argN: typeN
           =>
             result1: type1
             ...
             resultM: typeM

             Errors: kind1, ..., kindK

             Description.

   If no => is present in the description, then the request has no
   reply (it is asynchronous), although errors may still be reported.

   Events in Section 12 are described in the following format:

        EventName
             value1: type1
             ...
             valueN: typeN

             Description.

SECTION 4. COMMON TYPES

LISTofFOO

   A type name of the form LISTofFOO means a counted list of elements
   of type FOO; the size of the length field may vary (it is not
   necessarily the same size as a FOO), in some cases may be implicit,
   and is not fully specified in this document.

BITMASK and LISTofVALUE

   The types BITMASK and LISTofVALUE are somewhat special. Various
   requests contain arguments of the form:

        value-mask: BITMASK
        value-list: LISTofVALUE

   used to allow the client to specify a subset of a heterogeneous
   collection of "optional" arguments. The value-mask specifies which
   arguments are to be provided; each such argument is assigned a
   unique bit position. The representation of the BITMASK will
   typically contain more bits than there are defined arguments;
   unused bits in the value-mask must be zero (or the server generates
   a Value error). The value-list contains one value for each one bit
   in the mask, from least to most significant bit in the mask. Each
   value is represented with 4 bytes, but the actual value occupies
   only the least significant bytes as required; the values of the
   unused bytes do not matter.

Or Types

   A type of the form "T1 or ... or Tn" means the union of the
   indicated types; a single-element type is given as the element
   without enclosing braces.
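The value-mask and value-list convention described under "BITMASK and LISTofVALUE" above might be encoded along the following lines. This is a sketch only: the function name, the dictionary-based interface, and the set of "defined" bit positions are hypothetical, and the width of the mask on the wire is not specified in this document, so the mask is returned as a plain integer.

```python
import struct

def encode_value_list(values, defined_bits, byte_order="<"):
    """Build (value_mask, value_list_bytes) from {bit_position: value}.

    defined_bits is the set of bit positions the request actually defines;
    setting any other bit is rejected here, mirroring the Value error a
    server would generate for a nonzero unused bit.
    """
    mask = 0
    for bit in values:
        if bit not in defined_bits:
            raise ValueError("bit %d is not defined for this request" % bit)
        mask |= 1 << bit
    # One 4-byte value per one bit, from least to most significant bit.
    # Packing as a 32-bit integer places a narrower value in the least
    # significant bytes; the remaining bytes are zero here, although their
    # contents do not matter.
    payload = b"".join(
        struct.pack(byte_order + "I", values[bit] & 0xFFFFFFFF)
        for bit in sorted(values)
    )
    return mask, payload

mask, payload = encode_value_list({3: 7, 0: 1}, defined_bits={0, 1, 2, 3})
```

Note that the values are emitted in bit order (bit 0 first, then bit 3), regardless of the order in which the caller supplied them.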
   DEVICE: 32-bit id ( 8 bits each)
   WINDOW: 32-bit id
   PIXMAP: 32-bit id
   CURSOR: 32-bit id
   FONT: 32-bit id
   GCONTEXT: 32-bit id
   COLORMAP: 32-bit id
   DRAWABLE: WINDOW or PIXMAP
   ATOM: 32-bit id (top 3 bits guaranteed to be zero)
   VISUALID: 32-bit id (top 3 bits guaranteed to be zero)
   VALUE: 32-bit quantity (used only in LISTofVALUE)
   INT8: 8-bit signed integer
   INT16: 16-bit signed integer
   INT32: 32-bit signed integer
   CARD8: 8-bit unsigned integer
   CARD16: 16-bit unsigned integer
   CARD32: 32-bit unsigned integer
   TIMESTAMP: CARD32
   BITGRAVITY: {Forget, Static, NorthWest, North, NorthEast, West,
                Center, East, SouthWest, South, SouthEast}
   WINGRAVITY: {Unmap, Static, NorthWest, North, NorthEast, West,
                Center, East, SouthWest, South, SouthEast}
   BOOL: {True, False}
   EVENT: {KeyPress, KeyRelease, OwnerGrabButton, ButtonPress,
           ButtonRelease, EnterWindow, LeaveWindow, PointerMotion,
           PointerMotionHint, Button1Motion, Button2Motion,
           Button3Motion, Button4Motion, Button5Motion, ButtonMotion,
           Exposure, VisibilityChange, StructureNotify,
           ResizeRedirect, SubstructureNotify, SubstructureRedirect,
           FocusChange, PropertyChange, ColormapChange, KeymapState}
   POINTEREVENT: {ButtonPress, ButtonRelease, EnterWindow,
                  LeaveWindow, PointerMotion, PointerMotionHint,
                  Button1Motion, Button2Motion, Button3Motion,
                  Button4Motion, Button5Motion, ButtonMotion,
                  KeymapState}
   DEVICEEVENT: {KeyPress, KeyRelease, ButtonPress, ButtonRelease,
                 PointerMotion, Button1Motion, Button2Motion,
                 Button3Motion, Button4Motion, Button5Motion,
                 ButtonMotion}
   KEYCODE: CARD8
   BUTTON: CARD8
   KEYMASK: {Shift, CapsLock, Control, Mod1, Mod2, Mod3, Mod4, Mod5}
   BUTMASK: {Button1, Button2, Button3, Button4, Button5}
   KEYBUTMASK: KEYMASK or BUTMASK
   STRING8: LISTofCARD8
   STRING16: LISTofCHAR2B
   CHAR2B: [byte1, byte2: CARD8]
   POINT: [x, y: INT16]
   RECTANGLE: [x, y: INT16,
               width, height: CARD16]
   ARC: [x, y: INT16,
         width, height: CARD16,
         angle1, angle2: INT16]
   HOST: [family: {Internet, NS, ECMA, Datakit, DECnet}
          address: LISTofCARD8]

   The [x,y] coordinates of a RECTANGLE specify the upper left corner.

   The primary interpretation of "large" characters in a STRING16 is
   that they are composed of two bytes used to index a 2-D matrix;
   hence the use of CHAR2B rather than CARD16. This corresponds to the
   JIS/ISO method of indexing two-byte characters. It is expected that
   most "large" fonts will be defined with two-byte matrix indexing.
   For large fonts constructed with linear indexing, a CHAR2B can be
   interpreted as a 16-bit number by treating byte1 as the most
   significant byte; this means that clients should always transmit
   such 16-bit character values most significant byte first, as the
   server will never byte-swap CHAR2B quantities.

   The length, format, and interpretation of a HOST address are
   specific to the family.

SECTION 5. ERRORS

   In general, when a request terminates with an error, the request
   has no side effects (i.e., there is no partial execution). The only
   requests for which this is not true are ChangeWindowAttributes,
   ChangeGC, PolyText8, PolyText16, FreeColors, StoreColors, and
   ChangeKeyboardControl.

   The following error codes can be returned by the various requests:

Access
        An attempt to grab a key/button combination already grabbed by
        another client.

        An attempt to free a colormap entry not allocated by the
        client.

        An attempt to store into a read-only or an unallocated
        colormap entry.

        An attempt to modify the access control list from other than
        the local (or otherwise authorized) host.

        An attempt to select an event type, that at most one client
        can select at a time, when another client has already selected
        it.

Alloc
        The server failed to allocate the requested resource.

        Note that this only covers allocation errors at a very coarse
        level, and is not intended to (nor can it in practice hope to)
        cover all cases of a server running out of allocation space in
        the middle of service. The semantics when a server runs out of
        allocation space are left unspecified.

Atom
        A value for an ATOM argument does not name a defined ATOM.

Colormap
        A value for a COLORMAP argument does not name a defined
        COLORMAP.

Cursor
        A value for a CURSOR argument does not name a defined CURSOR.

Drawable
        A value for a DRAWABLE argument does not name a defined WINDOW
        or PIXMAP.

Font
        A value for a FONT argument does not name a defined FONT.

GContext
        A value for a GCONTEXT argument does not name a defined
        GCONTEXT.

IDChoice
        The value chosen for a resource identifier is either not
        included in the range assigned to the client, or is already in
        use.

Implementation
        The server does not implement some aspect of the request. A
        server which generates this error for a core request is
        deficient. As such, this error is not listed for any of the
        requests, but clients should be prepared to receive such
        errors, and handle or discard them.

Length
        The length of a request is shorter or longer than that
        required to minimally contain the arguments.

Match
        An InputOnly window is used as a DRAWABLE.

        Some argument (or pair of arguments) has the correct type and
        range, but fails to "match" in some other way required by the
        request.

Name
        A font or color of the specified name does not exist.

Pixmap
        A value for a PIXMAP argument does not name a defined PIXMAP.

Property
        The requested property does not exist for the specified
        window.

Request
        The major or minor opcode does not specify a valid request.

Value
        Some numeric value falls outside the range of values accepted
        by the request. Unless a specific range is specified for an
        argument, the full range defined by the argument's type is
        accepted. Any argument defined as a set of alternatives can
        generate this error.

Window
        A value for a WINDOW argument does not name a defined WINDOW.
   Note: the Atom, Colormap, Cursor, Drawable, Font, GContext, Pixmap,
   and Window errors are also used when the argument type is extended
   by union with a set of fixed alternatives, e.g., <WINDOW or
   PointerRoot or None>.

SECTION 6. KEYBOARDS

   Keycodes are always in the inclusive range [8,255].

   For keyboards with both left-side and right-side modifier keys
   (e.g., Shift and Control), the mask bits in the protocol always
   define the OR of the keys. If electronically distinguishable, they
   can have separate up/down events generated, and clients that want
   to distinguish can track the individual states manually.
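A client that wants to tell left-side and right-side modifiers apart has to keep its own record of key transitions, as the paragraph above suggests. The sketch below shows one way this might look; the class, its methods, and the key names "Shift_L" and "Shift_R" are hypothetical labels chosen for illustration, not identifiers defined by the protocol.

```python
class ModifierTracker:
    """Track individual physical modifier keys from separate up/down events."""

    def __init__(self):
        self.down = set()   # names of physical keys currently held

    def key_event(self, key, pressed):
        if pressed:
            self.down.add(key)
        else:
            self.down.discard(key)

    def mask_bit(self, left, right):
        """Protocol-style state: the OR of the left-side and right-side keys."""
        return left in self.down or right in self.down

def valid_keycode(code):
    """Keycodes are always in the inclusive range [8,255]."""
    return 8 <= code <= 255

tracker = ModifierTracker()
tracker.key_event("Shift_L", True)
tracker.key_event("Shift_R", True)
tracker.key_event("Shift_L", False)
```

After this sequence the combined Shift state is still set, because the right-side key remains down, while the tracker can additionally report that the left-side key has been released.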