by Marko Riedel
This recipe implements a search utility built on find and grep. It may be used to search a code base of, say, Objective-C code for certain code snippets for use in writing an application. It provides support for coding by “copy-and-paste.”
The basic idea is to run find on a directory in order to locate a set of files that match a shell pattern or a regular expression. We run grep for each file and look for a certain pattern. Matching lines including some context are collected and displayed for viewing. The user may inspect an entire file if she wishes.
We use a subclass of NSWindow to provide these features. Every such window consists of three components. The upper component contains two forms in which the user enters the pattern, the regular expression, the shell pattern, the context length and extra switches for grep. The lower half of the window contains the second and third components, which are placed in an NSSplitView: a scrollview that displays the results of the search and a scrollview for viewing the contents of a file that matched. There is a button between the first and the second component. The user clicks this button to execute the query. It also doubles as a progress indicator during the search, where it fills with blue starting at the left and moving to the right as the search progresses.
The results of a query are displayed with buttons and textviews. The button’s title shows what file matched and what line. The textview shows the match including the context. The program displays the entire file in the lower scrollview when the user clicks the button. The context is selected and scrolled so that it is visible. The query thus produces an alternating sequence of buttons and textviews, one pair for each match.
We need four classes to implement this recipe. The first class is a subclass of NSView called Flipped. It implements a view whose coordinate system has its origin in the upper left corner, with the y-axis extending downwards. Then there is RangeButton, a subclass of NSButton that can store a range. It is used in the view that displays the results of the query and it stores the location and the length of the range of a match. There is a controller for interfacing with the application. Finally, the class BrowseIt implements the query window described above.
Start with the usual headers. The class Flipped implements a view with a flipped coordinate system. We’ll be attaching subviews to this view. It does no drawing itself.
The class RangeButton is trivial as it only adds an instance variable to store the range and a method to retrieve it.
The method range reads the instance variable and the method setRange writes it.
The headers for the controller class and the custom window are next. The controller reads the location of the find and grep binaries from the user defaults on startup and remembers them during the lifetime of the application. This is the purpose of the two instance variables and the accessors find and grep. There is an important method that runs a command with some arguments and collects its standard output and standard error streams and returns its exit status. We will be using this method to run find and grep. There is a method that opens a new query window in response to a click on the corresponding menu item. It lets the user choose the directory to search in an openpanel.
The following series of definitions pertains to the user interface of the program and to the behavior of the custom window. We will not be displaying tabs in text views and define TABREP to be a string of spaces that replace a single tab. We define the width and height of the window with DIMENSION. We will be assembling two forms in the upper third of the window and we define an enumerated type that indicates the purpose of a formcell from those two forms. This lets us assemble the forms in a loop instead of coding every cell in turn. We define the titles of the form cells. The last define is for the point size of the fixed pitch font that we will use on buttons and in textviews.
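The tab replacement that TABREP stands for can be sketched in plain C, which also compiles as Objective-C. This is only an illustration: the helper name expand_tabs is hypothetical, and the four-space value of TABREP is an assumption, since the text does not say how many spaces the string holds.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed value: the article does not state how many spaces TABREP holds. */
#define TABREP "    "

/* Hypothetical helper: returns a newly allocated copy of src in which
   every tab is replaced by TABREP. The caller frees the result.
   Returns NULL if the allocation fails. */
char *expand_tabs(const char *src)
{
    size_t replen = strlen(TABREP);
    size_t tabs = 0, len = 0;
    const char *p;
    char *out, *q;

    /* First pass: count characters and tabs to size the result. */
    for (p = src; *p != '\0'; p++) {
        if (*p == '\t')
            tabs++;
        len++;
    }
    out = malloc(len - tabs + tabs * replen + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy, substituting TABREP for each tab. */
    for (p = src, q = out; *p != '\0'; p++) {
        if (*p == '\t') {
            memcpy(q, TABREP, replen);
            q += replen;
        } else {
            *q++ = *p;
        }
    }
    *q = '\0';
    return out;
}
```

Because every tab becomes the same number of spaces, the width of a line in a fixed pitch font is simply its character count times the width of one glyph.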
The custom window contains many instance variables that make it easy to reference the objects in its view hierarchy. It stores the controller because it will use it to find files and grep for patterns. It also remembers what directory it is supposed to process. It stores the formcells that contain the values of the switches for find and grep. It stores the search button because it will lock focus on it to draw the progress indicator. It also stores the split view and the upper scrollview, which will hold the query results, and the lower scrollview, which is used to view entire files. The query results will be attached to the document view of the upper scrollview as they come in and the variables attachAtY and maxwidth tell us where they go (what height) and what the maximum width of a result is, respectively. We’ll be computing the necessary widths and heights of text views and what rectangle we need to scroll to for display of a match in the file viewer. That’s why we store the fixed pitch font and its bounding box.
A custom window (we will also refer to it by the term “browse window”) is initialized with the value of the directory that we will search and the controller that provides the facility to run commands and collect the output. There are two shorthand methods to set the document view of the upper and lower scrollviews.
We store the parameters of the last successful query in the user defaults, and read them back in for use as the initial values when we open a new browse window. This is the purpose of readArgs and writeArgs.
The browse window will be the delegate of the split view that it contains. We implement two methods that prevent the user from collapsing the upper or the lower scrollview completely.
We have a method that takes the output lines of a single successful grep on a single file and assembles the corresponding series of buttons and textviews.
There is a method to display the contents of the file represented by a button in the lower scrollview.
We will respond to a certain number of error conditions. The method errView returns a view that holds the error message and can be placed in one of the two scrollviews.
The method search: is the action of the search button. It runs find and iterates over the files that it outputs, invoking first grep and then processMatchFor:data: for each file in turn.
There is a deallocation method that frees the directory string. Deallocation of views will be handled by placing them in an autorelease pool upon creation, so that they are freed when they are removed from the view hierarchy or when the window is closed.
We discuss the implementation of the controller before the implementation of the browse window. There are two accessors that retrieve the location of the find and grep commands, which was set on startup.
We now discuss the very important method that runs tasks and collects their output and error streams. It returns two arrays of lines by reference in the variable rptr. The first step is to create a task object and set its launch path and its arguments. (The call to NSLog can aid in debugging.)
We need a pipe to the output and error streams, so we create the appropriate objects. We retrieve two file handles for reading from the two pipes.
We are now almost ready to launch the task. We connect the two pipes to the two streams and declare the variables that will hold the data that they produce.
The actual run of the task takes place inside an exception handler. We try to run the task and read the data from the two output streams. We wait until the task exits and close the file descriptors that we used to read data from the task.
The error handler terminates the task if something went wrong and the task is still running. It sets the two arrays of output and error lines to contain the reason for the exception and returns -1.
What remains will only be executed if we were successful in launching the task and reading from its two data streams. We check the two data objects for the presence of data. If there are data, then we convert them into a C string and split this string into lines. We do not include the last return character because it would produce an empty string at the end of the array.
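The splitting step can be sketched in plain C as follows. The function name split_lines and its signature are hypothetical, and treating an empty buffer as an empty array is an assumption. The point of the sketch is the handling of the trailing newline, which is dropped before splitting so that it does not yield an empty last entry.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits the first len bytes of data on '\n' and
   returns a NULL-terminated array of newly allocated strings; *count
   receives the number of lines. One trailing newline is dropped first.
   Returns NULL if an allocation fails. */
char **split_lines(const char *data, size_t len, size_t *count)
{
    size_t n, i, start, k;
    char **lines;

    *count = 0;
    if (len == 0)
        return calloc(1, sizeof *lines);    /* no data: empty array */
    if (data[len - 1] == '\n')
        len--;                              /* drop the trailing newline */

    n = 1;
    for (i = 0; i < len; i++)
        if (data[i] == '\n')
            n++;
    lines = calloc(n + 1, sizeof *lines);
    if (lines == NULL)
        return NULL;

    start = 0;
    k = 0;
    for (i = 0; i <= len; i++) {
        if (i == len || data[i] == '\n') {
            size_t w = i - start;
            lines[k] = malloc(w + 1);
            if (lines[k] == NULL) {         /* undo earlier allocations */
                while (k > 0)
                    free(lines[--k]);
                free(lines);
                return NULL;
            }
            memcpy(lines[k], data + start, w);
            lines[k][w] = '\0';
            k++;
            start = i + 1;
        }
    }
    *count = n;
    return lines;
}
```

With the input "a\nb\n" this produces the two lines "a" and "b" rather than three entries with an empty string at the end.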
The method returns the exit status of the task, which is an important value that can tell us whether the task succeeded or not and what problems there were, if any.
This almost completes the implementation of the controller. The penultimate method is invoked when the application finishes launching and it reads the location of the two binaries for finding and grepping from the user defaults. It requests the defaults object and the file manager for this purpose.
It looks for the find binary in the defaults and sets it to a default value if there was no entry. It raises an exception if the binary is not executable.
The grep binary is handled the same way: look for it among the defaults, assign a default value if it is not found, and check that it is executable.
The last method responds to an entry on the application’s main menu and lets the user choose the directory that she wants to search. It runs a pretty standard openpanel dialogue to get this value. It obtains the openpanel and sets the title. The user may choose directories but not files and no multiple selections are allowed.
Should there be an entry for the key “Directory” among the user defaults, then we try to open this directory. (The value for this key is updated after successful queries.) We use the current directory if there was no entry.
We only set the openpanel’s directory if it is indeed a directory of the file system.
The last step is to run the panel. We create a new browse window for the chosen directory if the user clicked the okay button. We center the window on the screen and order it to the front.
We may now discuss the implementation of the browse window. The initializer is first. It is very simple conceptually, since it only needs to assemble the views that go into the content view of the window. The first step is to store and retain the directory. The controller is also stored but does not need to be retained. Next we declare and set the frame of the new window and its style mask, making it closable, titled, and resizable.
We set the minimum size of the window and its title, which contains the name of the directory being searched.
We must assemble the subviews that make up the interface. Start with the two forms in the upper third. We declare the frame for the forms and initialize it to be the right width; the height and the origin are set later. We declare arg to iterate over the form entries (the parameters) that we declared earlier. The variable f will be used later to iterate over the two forms, left and right, that is.
We iterate over the arguments that we require and place them in the two forms, reading the statically declared titles and choosing the left form for the first half and the right form for the second.
It remains to customize the forms and their cells to serve our needs. We ask the first cell for the size of the font that it uses and obtain a fixed pitch font of this size.
Now we iterate over the two forms, setting the label and the content font first.
The titles are supposed to be right justified. A single cell is as wide as half the window and ten points taller than its font.
We move each form to the right location. The origin of the left form is at the window’s border and the right form is next to the left. The upper boundary of the form coincides with the upper boundary of the window.
The forms are widthsizable. The left form is attached to the left border of the window, and the right form to the right.
The forms are made subviews of the window’s content view and put in an autorelease pool.
We connect the left form to the right so that the user may tab through the fields.
The search button lies below the two forms and is as wide as the window. We initialize its frame rectangle and allocate the button.
The title of the button is “search” and it gets the same font as the formcells. The window is its target and the action that it triggers when clicked is search:.
We wish to obtain the appropriate height of the button and move it to its place below the two forms. Hence we ask it for the appropriate size, of which we use the value for the height. The origin is below the two forms. We set the button’s frame.
The button is widthsizable and remains attached to the upper boundary of the window.
The button becomes the key view should the user tab out of the last field of the right form. Tabbing again takes the user back to the first field of the left form, which is the initial first responder of the window.
We attach the button to the window’s content view.
The lower part of the window is occupied by a split view, which we now create. This view will hold two scrollviews and occupies the space left over after the forms and the button have been taken into account. Its delegate is the window itself, which implements size restraint messages during resize operations.
The splitview is widthsizable and heightsizable. It expands to fill the entire lower part of the window. We attach it to the view hierarchy.
It remains to create the two scrollviews. Both are half as high as the split view and as wide as the window. The upper scrollview is placed in the upper half of the splitview.
It should resize with the splitview, has a vertical scroller and its background is white. We place it in the splitview.
The lower scrollview has the same dimensions as the upper one, but is placed in the lower left corner of the splitview.
Its sizing behavior is the same as that of the upper scrollview, as are its choice of scrollers and its background color. It is also placed in the view hierarchy.
All of these views go into an autorelease pool, so that they are freed upon removal from the view hierarchy and upon closure of the window.
We are now done assembling the window and read the arguments of the last successful query, if any, into the formcells of the two forms.
We select the text in the first cell so that the user can immediately start typing when the window appears.
We’ll be using a fixed pitch font for the textviews that display matches. We now obtain a font of the right size and the dimensions of its bounding box for later use. This ends the initializer.
New document views of the two scrollviews need to be autoreleased so that they are later freed at the appropriate times. We define two methods that send an autorelease message to a view before placing it in the upper or lower scrollview.
The process of reading the search parameters from the user defaults is very simple. First obtain the defaults object, then retrieve the values by their titles and write them into the appropriate form cell.
The search parameters are written to the user defaults after every successful search. This is done by first retrieving the user defaults.
Then we iterate over the cells that hold the parameters, checking their values in turn. Values that are not empty are recorded in the user defaults.
We also record the current directory. The write process ends with an invocation of synchronize, so that the parameter values will be used the next time a new window is opened.
Recall that the browse window is the splitview’s delegate. We implement two methods that the split view will invoke when the divider is moved: one to constrain the minimum coordinate and one to constrain the maximum coordinate. We set these to be a sixth of the splitview’s height away from its lower and upper boundary respectively. This keeps the user from moving the divider to a position where the upper or lower scrollview can no longer be scrolled in a useful way.
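The arithmetic behind the two constraints is small enough to show in isolation. The following plain C sketch uses hypothetical function names and works on a bare height value; which boundary counts as the minimum and which as the maximum depends on the orientation of the split view’s coordinate system.

```c
/* Smallest allowed divider coordinate: a sixth of the height away
   from the boundary at coordinate zero. */
double min_divider(double height)
{
    return height / 6.0;
}

/* Largest allowed divider coordinate: a sixth of the height away
   from the opposite boundary. */
double max_divider(double height)
{
    return height - height / 6.0;
}

/* Clamps a proposed divider position into the allowed interval. */
double clamp_divider(double proposed, double height)
{
    if (proposed < min_divider(height))
        return min_divider(height);
    if (proposed > max_divider(height))
        return max_divider(height);
    return proposed;
}
```

For a split view that is 600 points high, the divider can thus move between 100 and 500 points, so each scrollview keeps at least 100 points of height.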
The method processMatchFor:data: plays a key role in the application. It is invoked when the grep was successful and the file contained lines that match the pattern. Here is an example of the output from the grep command.
grep -n -C 1 dealloc tasks/browseit/code/browseit.m
847- [directory release];
848: [super dealloc];
The input of this method is the name of the file and the output lines. A line either contains match and context data or it is a divider between sections. We skip those dividers. We read each line in turn and produce a button and a textview for the section whenever the line number of the current line is not one more than the previous line number, which is how we recognize sections. We start by getting the document view of the upper scrollview. We’ll attach sections to this view. This method merely attaches sections at the bottom edge of the previous section, which is stored in the instance variable attachAtY. The caller is responsible for resizing the document view once all sections have been attached. We declare variables for iterating over the output lines. Actually our loop will include an extra iteration (a divider) at the very end, which is how we process the last section. We will store the range of the current section, i.e. where it begins and how many lines it contains. We store the lines of each section in a mutable data object. We will also compute the length of the longest line (maxlen) of each section so that we may choose the appropriate width of the textview.
We iterate over the lines including an extra iteration at the end, where we use a divider as the line’s content. We extract the current line, replace tabs by spaces and convert it into a C string.
Recall that the line number starts the line. We extract its value and skip over the digits by which it is represented. If there were no digits, then we have a divider, and we skip all dividers except for the last one.
The actual content of the line is stored in the variable rest. Note that we must skip over the colon that marks matches and the dash that marks context. We have a new section if we are at the very end or if there was a skip in the line number.
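The parsing of a single output line can be sketched in plain C as follows. The function name parse_grep_line and its out-parameters are hypothetical; the sketch only shows how the leading digits, the marker character and the remaining text are separated, and how a divider is recognized by the absence of digits.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: parses one line of `grep -n` output. On success
   it stores the line number in *num, a pointer to the text after the
   ':' (match) or '-' (context) marker in *rest, and returns 1.
   It returns 0 for a divider, i.e. a line that does not start with a
   digit, and leaves *num and *rest untouched. */
int parse_grep_line(const char *line, long *num, const char **rest)
{
    const char *p = line;
    long value = 0;

    if (!isdigit((unsigned char)*p))
        return 0;                       /* divider between sections */

    while (isdigit((unsigned char)*p)) {
        value = value * 10 + (*p - '0');
        p++;
    }
    if (*p == ':' || *p == '-')
        p++;                            /* skip the match or context marker */

    *num = value;
    *rest = p;
    return 1;
}
```

Applied to the line that starts with 848 in the example above, the function reports line number 848 and leaves rest pointing just past the colon.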
First we create the textview that holds the match and its context. We compute its size from the longest line (width) and the total number of lines in this section (height). We allocate it and initialize it with the right frame. We will set the origin later. It should go below the previously attached views.
We set it not to be editable, to use the fixed pitch font, and to contain the current set of lines and put it into an autorelease pool for automatic deallocation upon removal from the view hierarchy or a window closure.
Next we create the button that will take the user to the entire file if clicked. It is situated above the textview. We will adjust the dimensions and the origin of the button later.
The button’s title lists the name of the file and the line number. The range that it represents is the current range. It is a momentary push button.
The window (i.e., self) is the target of the button and its action is the method that loads a file into the lower scrollview. The button represents the current file, which is relative to the directory that is stored in the corresponding instance variable.
We are just about done with the button and ask it to set its frame to its preferred size. The button also goes into an autorelease pool.
The instance variable maxwidth stores the maximum width of the buttons and textviews created during the current find process. It must be updated if the new button or textview is wider than its value. We build a chain of views moving downwards along the y-axis of the document view and attach the new sections at the bottom of the chain.
We attach the button and the textview at the bottom of the chain. We also move the attachment point down by the combined height of the new objects. This completes the construction of a new section.
Recall that we store state as we iterate over the output lines, namely the contents of the current section and its range. We must reset these values at the beginning and after a section has been completed. These are the two cases that the following if statement detects. We reset the section line array to the current line, set the range to be a one-line range at the current line and the maximum number of characters to be those of the current line.
There is also housekeeping to do when we are inside a section. We add the current line to the array of lines, increase the length of the range by one and update the maximum width in characters if necessary. This concludes the processing method, which is invoked with the grep output for every file that matches the name or regular expression given in the form.
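The section logic, stripped of all view construction, can be sketched in plain C. The struct section type and the function group_sections are hypothetical, and the sketch assumes that dividers have already been removed, so that it only sees line numbers. It mirrors the extra iteration at the end of the loop, which flushes the last section.

```c
#include <stddef.h>

/* Hypothetical type: the first line of a section and its line count. */
struct section {
    long location;
    long length;
};

/* nums holds the n line numbers parsed from grep's output, dividers
   already skipped. Writes at most max sections to out and returns the
   number of sections found, which may exceed max. */
size_t group_sections(const long *nums, size_t n,
                      struct section *out, size_t max)
{
    size_t i, found = 0;
    long start = 0, length = 0;

    for (i = 0; i <= n; i++) {          /* the extra pass flushes the last section */
        int at_end = (i == n);
        int gap = !at_end && length > 0 && nums[i] != start + length;

        if ((at_end || gap) && length > 0) {
            if (found < max) {
                out[found].location = start;
                out[found].length = length;
            }
            found++;
            length = 0;                 /* reset the state for the next section */
        }
        if (!at_end) {
            if (length == 0) {
                start = nums[i];        /* a new section begins here */
                length = 1;
            } else {
                length++;               /* still inside the current section */
            }
        }
    }
    return found;
}
```

For the line numbers 10, 11, 12, 40, 41 this yields two sections: three lines starting at line 10 and two lines starting at line 40.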
We have seen that the buttons for each section have the browse window as the target. A click on one of these buttons should load the entire file that the section excerpts and scroll to the excerpt. The method loadIt: is the action of these buttons and implements the desired load behavior. It starts by computing the full path to the file and attempts to load it into memory (into a string object).
A descriptive error message will be displayed in the lower scrollview if the file cannot be loaded. We replace all tabs by spaces if the file did load correctly.
We now prepare to iterate over the characters in the file in order to compute several statistics. We need to transform the range that is to be selected from a range of lines into a range of characters. We also need to know the widest line of the file so that we may choose the right size of textview. The variable ptr is used to iterate over the string and prev holds the offset in characters of the previous line. The variables from and to describe the extent of the selection.
Start iterating over the characters. We have reached the end of a line if we see a newline character or if we are at the end of the string, in which case the length of the line should include the last character.
If the length of the current line is larger than the recorded maximum, then the maximum is updated.
We record the character offset if we have found the first line of the range. It starts one character after the previous newline. Similarly, the length of the range in characters is the difference between the current offset and the start of the range. The current newline becomes the previous newline now that we are done.
Characters in the interior of a line merely increase the length of the line. We move to the next character in all cases.
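The scan over the file contents can be sketched in plain C. The struct text_stats type and the function scan_text are hypothetical, and the variable names differ from the ones used in the text. The sketch assumes 1-based line numbers, as produced by grep, and a selection that excludes the newline after its last line.

```c
#include <stddef.h>

/* Hypothetical result type for one pass over the text. */
struct text_stats {
    size_t from;    /* character offset where the selection starts */
    size_t length;  /* selection length in characters */
    size_t maxlen;  /* length of the widest line */
    size_t lines;   /* total number of lines */
};

/* first is the 1-based number of the first selected line and count the
   number of selected lines. */
struct text_stats scan_text(const char *text, size_t first, size_t count)
{
    struct text_stats s = {0, 0, 0, 0};
    size_t pos = 0, line_start = 0, line = 1;

    for (;;) {
        if (text[pos] == '\n' || text[pos] == '\0') {
            size_t len = pos - line_start;

            if (text[pos] == '\0' && len == 0)
                break;                      /* nothing after the final newline */
            if (len > s.maxlen)
                s.maxlen = len;
            if (line == first)
                s.from = line_start;        /* the selection begins on this line */
            if (count > 0 && line == first + count - 1)
                s.length = pos - s.from;    /* the selection ends on this line */
            s.lines = line;
            if (text[pos] == '\0')
                break;
            line++;
            line_start = pos + 1;           /* one past the newline just seen */
        }
        pos++;
    }
    return s;
}
```

For the text "ab\ncdef\ng\n" and a selection of two lines starting at line 2, the selection begins at offset 3 and is 6 characters long, the widest line has 4 characters and there are 3 lines in total.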
We may create the textview once all relevant statistics have been computed. Its dimensions are determined by the length of the widest line and the total number of lines. We allocate the view and initialize it with the right frame size.
The view is not editable and uses the fixed pitch font that is also used in the section view. Its contents are the contents of the file with tabs replaced by spaces. We select the lines that were shown in the section above and place the new textview in the lower scrollview.
We now compute the rectangle that should be visible in the scrollview. It is given by the upper part of the selected range. We could have used scrollRangeToVisible: for this purpose, but its current implementation only shows the first line, and we wish to show as many lines as possible. We compute the visible height of the content view of the lower scrollview, minus one line. (This really is the content view and not the document view.) We compute the rectangle vrect, which encloses the selection. If the rectangle is higher than what can be displayed, then its height is set to the maximum possible value. Finally we ask the textview to scroll the rectangle into the visible part of the scrollview. This completes the method that loads files in response to button clicks on range buttons.
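The computation of the rectangle can be sketched in plain C. The struct rect type and the function selection_rect are hypothetical stand-ins for the AppKit types. The sketch assumes that every line has the same height, taken from the bounding box of the fixed pitch font, and that the y-axis of the textview points downwards.

```c
/* Hypothetical stand-in for a rectangle in view coordinates. */
struct rect {
    double x, y, width, height;
};

/* first is the 1-based number of the first selected line, count the
   number of selected lines, line_height the height of one line of
   text, width the width of the textview and visible_height the height
   of the visible part of the scrollview. */
struct rect selection_rect(long first, long count, double line_height,
                           double width, double visible_height)
{
    struct rect r;
    double limit = visible_height - line_height;   /* visible height minus one line */

    r.x = 0.0;
    r.y = (double)(first - 1) * line_height;       /* top of the first selected line */
    r.width = width;
    r.height = (double)count * line_height;
    if (r.height > limit)
        r.height = limit;                          /* show as much as fits */
    return r;
}
```

Clamping the height keeps the top of the selection in view: when the selection is taller than the visible area, the rectangle covers only its upper part.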
We have already encountered the method errView: on several occasions. Recall that it manufactures a textview containing an error message for placement in the upper or lower scrollview. This method is kept simple and can certainly be improved. We make a textview whose frame has the default width in use throughout the application and is half as high as this default. We choose an eighteen-point system font for our messages.
Error messages are of course not editable and we record this fact. The background color of the view is white and it uses the font mentioned above.
We prefix the error message with the current date and time. This is so that the message changes even if the user repeatedly clicked the search button or a section button. The change in the date shows the user that processing has occurred.
We store the error message with the prefix in the textview and return the result.
The penultimate method of a browse window and of this program is the action method search:. It invokes find and runs grep on all matching files, using processMatchFor:data: to assemble the button-textview pairs that go into the upper scrollview. Start by constructing the arguments to find. We indicate that find should follow symbolic links and list only files and not directories.
We read the values for the name and the regular expression from the form cells. (Note that the name applies to the name of the file and the regular expression applies to the whole path.) We add the name pattern if there was one. Similarly, if the user entered a regular expression, then it is added to the find arguments.
We must specify an action for find to take on matching files. We choose the action printf with the directive P, which prints the file’s “name with the name of the command line argument under which it was found removed,” so that the user sees path components starting with the components in the directory that is being searched.
We are now ready to invoke find. We ask the controller to run the find program and collect its standard output and standard error streams in two arrays. The array files contains the standard output.
There are two possible error conditions. The first occurs if find exited with a non-zero exit code, in which case we collect its standard error for display with errView:. It is also an error if find did not find any files. The error message goes into the upper scrollview in either of these cases.
We sort the files if there was no error.
We now prepare the arguments to grep. Only the last argument (the file being searched) will change as we iterate over the files; all the other arguments stay constant. The first argument is absolutely essential: it tells grep to include line numbers in its output, which we use to parse it into sections. If the user entered a context value (number of lines surrounding a match that should be displayed), then this value is used.
We also have a form cell for additional switches that can be passed on to grep, like the switch -i for case-insensitive matching or -E for extended regular expressions (very useful). We read the value from the cell and split it on the space character; all switches that are not empty are then added to the arguments for grep.
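Splitting the switches and discarding the empty pieces can be sketched in plain C with strtok, which skips runs of separators on its own. The function name split_switches is hypothetical, and the sketch modifies its input in place instead of building an array of string objects.

```c
#include <string.h>

/* Hypothetical helper: splits switches in place on spaces and stores
   pointers to the non-empty pieces in argv, at most max of them.
   Returns the number of pieces stored. */
size_t split_switches(char *switches, char **argv, size_t max)
{
    size_t n = 0;
    char *tok = strtok(switches, " ");

    while (tok != NULL && n < max) {
        argv[n++] = tok;
        tok = strtok(NULL, " ");
    }
    return n;
}
```

An entry such as "-i  -E " with doubled and trailing spaces therefore contributes exactly two arguments, -i and -E.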
We create the document view of the upper scrollview before we start the loop. Its dimensions will be set later, once all sections have been computed. We allocate and initialize it and put it into the scrollview. The section buttons and textviews will be attached to this view.
The last argument to grep that does not change is the pattern.
Recall that grep signals when it finds a match in a binary file. We collect these matches into an array that we display for the user when there were no matches in text files.
We may now start iterating over the file names. The variable fpos indicates the current position and the variable fmax the number of files. We set attachAtY to be the bottom margin of the document view and the maximum width of a section to be zero. Recall that we must lock focus on the search button because we will use it as a progress indicator, filling it with blue starting at the left and moving to the right as we process files.
The first thing we do inside the loop is to draw the progress indicator. The ratio fpos/fmax indicates what width to fill. We draw a rectangle and flush the window so that the user sees the indicator advance. We create an autorelease pool that will hold objects created during processing, so that we can immediately release those that are no longer referenced after an item has been processed, as opposed to releasing them in the application’s run loop.
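The width of the filled part of the button follows directly from the ratio fpos/fmax. A plain C sketch, with the hypothetical function name progress_width and a guard for an empty file list added as an assumption:

```c
/* Width of the filled part of the button after fpos of fmax files
   have been processed. */
double progress_width(unsigned long fpos, unsigned long fmax,
                      double button_width)
{
    if (fmax == 0)
        return 0.0;     /* nothing to process, nothing to fill */
    return button_width * (double)fpos / (double)fmax;
}
```

The conversion to double matters here: an integer division fpos/fmax would be zero for every file but the last.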
The last argument to grep, which is not constant, is the full path to the file being searched. We ask the directory string to provide the path that is obtained by appending the relative path to the file.
We run grep with the array of arguments and extract the contents of the standard output stream.
The exit status of grep is 2 if an error occurred. If this is the case, then we build an error message that contains the name of the file and the error message from grep, or rather, the contents of its standard error stream. This error message is placed in the upper scrollview. We must unlock focus from the search button and restore it to its former state before we exit the routine.
The exit status of grep is zero if there was a match. There are two possibilities: there was a text file match or a match in a binary file. The former includes match and context data that are prefixed with line numbers; the latter, a message that there was a match. Hence we may distinguish the two cases by checking if the first line begins with a digit. If so, we process the output into sections, assembling buttons and textviews with processMatchFor:data:. If not, the file is added to the array of binaries.
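The decision made for each file can be summarized in a small plain C function. The enumeration and the name classify_grep are hypothetical. The sketch treats any status other than 0 and 2 as “no match”, which is an assumption that goes slightly beyond the two cases described in the text.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result codes for one run of grep on one file. */
enum grep_result {
    GREP_TEXT_MATCH,
    GREP_BINARY_MATCH,
    GREP_NO_MATCH,
    GREP_ERROR
};

/* status is the exit status of grep and first_line the first line of
   its standard output, or NULL if there was no output. */
enum grep_result classify_grep(int status, const char *first_line)
{
    if (status == 2)
        return GREP_ERROR;
    if (status != 0)
        return GREP_NO_MATCH;           /* assumption: anything else means no match */

    /* Output for a text file starts with a line number. */
    if (first_line != NULL && isdigit((unsigned char)first_line[0]))
        return GREP_TEXT_MATCH;
    return GREP_BINARY_MATCH;
}
```

A text match is handed to processMatchFor:data:, a binary match is added to the array of binaries, and an error ends the search with a message in the upper scrollview.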
The loop ends with the removal of the file name from the arguments; i.e. we pop the name that we pushed at the end of the array at the beginning of the loop. We also release memory that was allocated during processing of the current file and is no longer used. We unlock focus and restore the search button’s appearance once we are done processing all the files.
It is quite possible that there were no matches. We must inform the user of this event. We produce an error message to this effect. It includes the list of binary matches if there were any and goes into the upper scrollview.
There is some clean-up work to do if we did find matches. We now know the required total size of the document view and resize it accordingly. We tell the document view to scroll to the first match, so that the user sees the first file that matched in the upper left corner of the scrollview’s content view.
The search action writes the search arguments into the user defaults database if the query was successful.
A browse window has a method for deallocation, which frees the space occupied by the directory string. Note that the memory from the buttons and textviews should be reclaimed automatically upon removal from the view hierarchy or window closure, since they were placed in an autorelease pool just after being created.
The main function has the usual format. It instantiates an autorelease pool, the controller and the application object.
The application’s menu includes an entry that lets the user open a directory with a browse window for searching. There is an entry for the action copy:, so that the user may copy text from the section textviews or the file textview into another application. The item “Quit” as usual quits the program.
It remains to set the application’s delegate and run the application. Hopefully this little program illustrates successfully how a productivity tool may be implemented in a program of a few hundred lines.
Making the number of spaces that replace a tab customizable is left as an exercise.