by Marko Riedel, with an idea by Alexander Malmberg
The goal is to implement a screen grab utility for X11 using GNUstep. This application, or rather tool, should collect data for all windows that are currently visible, whether partially or completely, into a tree that represents the parent-child relationship between them. The most important item per window is the image that it displays. Trees map naturally onto browsers, so we will let the user navigate the window tree with an NSBrowser. The application has one window, whose upper portion displays the browser. The lower portion contains a scroll view whose document view is a special image view that shows the window’s image as it was when the snapshot was taken (just after launch). We read image data only for the onscreen rectangle of each window. The user can save images to TIFF files. The application works in two phases: first it queries the X server for the contents of the onscreen windows and builds a window tree; then the user views, and perhaps saves, the contents of the retrieved windows.
We’ll be working with three classes. The class XWinData encapsulates the attributes of an X window and an NSImage, which holds the current image obtained for the window. The class ImageView is a simple replacement for NSImageView that provides fast display of an image represented by a bitmap; it will be the document view of the scroll view. The third class is Controller, which collects the data from the X server, builds the tree of XWinData objects starting with the root window, and acts as a passive delegate for the browser. It also implements a method to save images.
The program starts with the necessary headers, notably the one for Xlib.
The class XWinData encapsulates data that describe an X window as well as the image contained in the window. An X window is on a certain display and has a parent. We record these for easy reference. We also record the attributes of the window. We’ll be using the location and the size attributes, as well as the map state of the window (whether it is viewable or not). The instance variable children makes XWinData into a tree structure, with the individual objects being the nodes. We need the tree data to make the set of windows easy to navigate and provide data for the browser.
One of the first things that we’ll do is compute the portion of the window that is actually on the screen, in screen coordinates. We only process windows that are at least partly on the screen. An XWinData object also holds a description of itself, which is the name of the window if it has one, and the geometry otherwise. Finally we have the image, which is an NSImage object that holds the image that we retrieved for the window.
The initializer is straightforward. Note that it doesn’t require the array of children or the image, which are computed later in the program, and only if necessary.
We need access to the attributes of the window. The methods for this return instance variables. The method onScreen indicates whether any portion at all of the window is on the screen.
The next two methods let us set and retrieve an array of children for the receiver.
The entries of the browser’s columns should be in alphabetical order so we need a comparison method between two XWinData instances that we’ll use to sort them.
If it turns out that some part of a window is on the screen, possibly obscured, then we compute an image from the contents of the window. There is an accessor for this instance variable.
The last method sends appropriate release messages to the array of children and the description (a string) when the object is being deallocated.
We now discuss these methods in detail. The initializer stores its arguments in the appropriate instance variables. It then tries to obtain the window’s attributes from the server and raises an exception if it fails.
The next step is to translate the coordinates of the window’s origin into screen coordinates (the values in the attribute structure are in the parent’s coordinate system).
We may compute the portion of the window that is on the screen once the coordinates have been translated. We intersect the screen rectangle with the window’s rectangle for this purpose. We’ll be using the result when we retrieve the image and decide whether the window should be included in the tree.
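The clipping step can be sketched in plain C as follows. This is a minimal illustration, not the article's own listing: the `Rect` type and the names `clip_to_screen` and `rect_is_on_screen` are hypothetical, and we assume the window rectangle has already been translated into screen coordinates.

```c
#include <stdbool.h>

/* Hypothetical rectangle type: origin and size in pixels. */
typedef struct {
    int x, y;
    int width, height;
} Rect;

static int max_int(int a, int b) { return a > b ? a : b; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Intersect the window rectangle (in screen coordinates) with the
   screen rectangle. An empty intersection is returned as all zeros. */
Rect clip_to_screen(Rect win, Rect screen)
{
    int left   = max_int(win.x, screen.x);
    int top    = max_int(win.y, screen.y);
    int right  = min_int(win.x + win.width,  screen.x + screen.width);
    int bottom = min_int(win.y + win.height, screen.y + screen.height);

    Rect r = { 0, 0, 0, 0 };
    if (right > left && bottom > top) {
        r.x = left;
        r.y = top;
        r.width = right - left;
        r.height = bottom - top;
    }
    return r;
}

/* A window counts as on screen when its clipped rectangle is not empty. */
bool rect_is_on_screen(Rect clipped)
{
    return clipped.width > 0 && clipped.height > 0;
}
```

For a 1024 by 768 screen, a 300 by 200 window whose origin is at (-100, 700) is clipped to a 200 by 68 rectangle at (0, 700), so it would be kept in the tree.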
Next the initializer builds the description and then it exits. If the name property of the window is set, then we use it as the description. Otherwise we record that there was no name: the root window gets the name ROOT WINDOW and other unnamed windows get the name {no title}. The description of unnamed windows includes the geometry of the window.
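The naming rule might look like this in plain C. The function name `build_description` and its parameters are hypothetical, and the geometry layout (WIDTHxHEIGHT followed by signed X and Y offsets) is an assumption; the text only says that the geometry is included for unnamed windows.

```c
#include <stdio.h>
#include <string.h>

/* name: the window's name, or NULL if the property is not set.
   is_root: nonzero for the root window.
   Writes the description into buf (at most size bytes, always terminated). */
void build_description(char *buf, size_t size, const char *name, int is_root,
                       int x, int y, int width, int height)
{
    if (name != NULL && strlen(name) > 0) {
        /* A named window is described by its name alone. */
        snprintf(buf, size, "%s", name);
    } else {
        const char *label = is_root ? "ROOT WINDOW" : "{no title}";
        /* Assumed layout: label, then WIDTHxHEIGHT+X+Y. */
        snprintf(buf, size, "%s %dx%d%+d%+d", label, width, height, x, y);
    }
}
```

With this layout an unnamed 640 by 480 window at (10, 20) would be described as `{no title} 640x480+10+20`.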
The newly initialized object is put into an autorelease pool.
The methods description, map_state and xclass return the appropriate instance variables.
A window is on the screen if its onscreen-rectangle is not empty.
We need to set and retrieve the array of children. Setting the array will occur at most once, so we don’t have to worry about releasing a prior content of the instance variable.
XWinData objects are ordered by their description strings for sorting purposes, that is, to get the right order in the browser’s columns.
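As an illustration of ordering nodes by description, here is a small plain C sketch built on `qsort`. The `WinNode` type is a hypothetical stand-in for an XWinData object, and `strcmp` compares bytes, so the order is case-sensitive; the comparison used by the application itself may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an XWinData node; only the field
   that determines the order is shown. */
typedef struct {
    const char *description;
} WinNode;

/* qsort comparator: the array elements are pointers to nodes. */
static int compare_nodes(const void *a, const void *b)
{
    const WinNode *wa = *(const WinNode *const *)a;
    const WinNode *wb = *(const WinNode *const *)b;
    return strcmp(wa->description, wb->description);
}

/* Sort an array of node pointers in place, as one browser column needs. */
void sort_children(WinNode **children, size_t count)
{
    qsort(children, count, sizeof children[0], compare_nodes);
}
```

Because `{` sorts after the ASCII letters, unnamed windows described as `{no title}` end up at the bottom of a column under this comparison.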
The next method is perhaps the most important of the entire application. It retrieves the window’s image from the window server and stores it in an NSImage. We’ll be using a bitmap as the image’s representation, so we can write data directly into the bitmap’s data plane. (We’ll use a non-planar bitmap with one plane.)
The first step is to translate the screen coordinates of the onscreen rectangle into window coordinates, so that we can reference the portion of the image that we wish to read. We translate the coordinates from root window coordinates to window coordinates and use the clipped bounds rectangle to obtain the dimensions of the displayed rectangle.
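The arithmetic behind this translation is a simple subtraction, sketched here in plain C. The `Rect` type and the function `screen_to_window` are hypothetical, and we assume the window's origin in root coordinates is already known; the application may instead ask the server to do the translation.

```c
/* Hypothetical rectangle type: origin and size in pixels. */
typedef struct {
    int x, y;
    int width, height;
} Rect;

/* clipped: the onscreen rectangle in root (screen) coordinates.
   origin_x, origin_y: the window's origin in root coordinates.
   Returns the same rectangle expressed in window coordinates. */
Rect screen_to_window(Rect clipped, int origin_x, int origin_y)
{
    Rect r;
    r.x = clipped.x - origin_x;
    r.y = clipped.y - origin_y;
    r.width = clipped.width;    /* the size does not change */
    r.height = clipped.height;
    return r;
}
```

For a window whose origin is at (-100, 700) and whose onscreen rectangle is 200 by 68 at (0, 700), the rectangle to read starts at (100, 0) inside the window.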
We are now ready to read the image data and invoke XGetImage for this purpose. We raise an exception if we couldn’t read the image.
We must convert the data in the XImage into a bitmap image representation, which we now declare. It has the correct dimensions and uses eight bits per sample (red, green, blue). There is no alpha channel. We retrieve the bitmap’s data plane for easy reference. There is only one plane because the bitmap is not planar.
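To make the layout of the data plane concrete, the following plain C sketch writes one pixel into an interleaved RGB buffer with eight bits per sample and no alpha. The function `put_rgb` is hypothetical, and the row stride is passed in explicitly because we do not assume that rows are packed without padding.

```c
#include <stddef.h>

/* data: start of the single, non-planar data plane.
   bytes_per_row: stride of one row in bytes (at least 3 * width).
   Stores the three samples of pixel (x, y) in R, G, B order. */
void put_rgb(unsigned char *data, size_t bytes_per_row, int x, int y,
             unsigned char r, unsigned char g, unsigned char b)
{
    unsigned char *p = data + (size_t)y * bytes_per_row + (size_t)x * 3;
    p[0] = r;
    p[1] = g;
    p[2] = b;
}
```

In a two-pixel-wide image with a six-byte stride, pixel (1, 1) occupies bytes 9, 10 and 11 of the plane.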
We briefly digress to explain how we obtain the colors of the pixels of the image. The function XGetPixel reads the pixels of the image, which are unsigned long integers. If we are not on a TrueColor visual, then we obtain the color components for the pixel from the window’s color map with a call to XQueryColor; otherwise we extract the bits for each color from the pixel’s value. Once we have these, it is easy to write them into the bitmap data plane.
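For the color map case, note that Xlib reports each component of an XColor as a 16-bit value, while the bitmap stores eight bits per sample. One way to reduce the values, sketched in plain C, is to keep the high byte. The `ColorEntry` type is a hypothetical stand-in for the relevant fields of XColor, and keeping the high byte is our assumption about the conversion.

```c
/* Stand-in for the fields of Xlib's XColor that matter here. */
typedef struct {
    unsigned long pixel;
    unsigned short red, green, blue;   /* each in the range 0..65535 */
} ColorEntry;

/* Reduce the 16-bit components to 8 bits by keeping the high byte. */
void color_to_rgb8(const ColorEntry *c, unsigned char rgb[3])
{
    rgb[0] = (unsigned char)(c->red >> 8);
    rgb[1] = (unsigned char)(c->green >> 8);
    rgb[2] = (unsigned char)(c->blue >> 8);
}
```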
We do TrueColor visuals first. The visual contains a bit mask for each color component. We must determine where in the pixel value the component starts, and how many bits it takes up. Therefore we declare a “shift” variable (offset, i.e. position) and a “bits” variable (number of bits) for each color component.
We apply the following procedure to each mask. First shift the mask to the right as long as the lowest bit is zero. The number of shifts indicates the position of the mask in the pixel. Next shift the mask to the right while the lowest bit is one. The number of these shifts yields the number of bits of the color component. We use at most eight bits. If there are more than eight bits, then we increment the position counter to skip over the least significant bits so that only the eight most significant bits remain.
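The procedure can be sketched in plain C as follows. The `Component` type and the function names are hypothetical. The extraction step, which left-aligns components narrower than eight bits so that they span the 0..255 range, is our assumption and is not described in the text.

```c
/* Position and width of one color component inside a pixel value. */
typedef struct {
    int shift;   /* bit position of the lowest bit that we keep */
    int bits;    /* number of bits that we keep, at most 8 */
} Component;

Component decompose_mask(unsigned long mask)
{
    Component c = { 0, 0 };
    if (mask == 0)
        return c;                 /* no such component in this visual */

    /* Skip the zero bits below the mask: this gives the position. */
    while ((mask & 1UL) == 0) { mask >>= 1; c.shift++; }
    /* Count the run of one bits: this gives the width. */
    while ((mask & 1UL) == 1) { mask >>= 1; c.bits++; }

    /* Keep only the eight most significant bits. */
    if (c.bits > 8) {
        c.shift += c.bits - 8;
        c.bits = 8;
    }
    return c;
}

/* Assumed extraction: isolate the component, then left-align it
   in a byte when it has fewer than eight bits. */
unsigned char extract_component(unsigned long pixel, Component c)
{
    unsigned long v = (pixel >> c.shift) & ((1UL << c.bits) - 1);
    return (unsigned char)(v << (8 - c.bits));
}
```

On a 16-bit visual with the red mask 0xF800 this yields a shift of 11 and 5 bits. For a 10-bit component with mask 0x3FF00000 the shift moves from 20 to 22 so that only the top eight bits remain.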