by Marko Riedel
We provide a sample implementation of a basic steganography algorithm, by which hidden data are placed in a TIFF image, which is assumed to contain RGB values in either a planar or a meshed configuration. Think of the input data as a stream of bits. We place these bits in the image by altering the least significant bit of the red color component. The actual location of the data in the image is determined by a passphrase. This assures that even if someone can determine which pixels have been altered, she will not know in what order the altered pixels should be read.
The first task is to construct a permutation of the pixel locations from the passphrase. We may think of the passphrase as an infinite sequence of bits (repeat the passphrase to get additional bits). We start by implementing a class that delivers chunks of bits from this source.
The class BitSource stores the passphrase, which we restrict to ASCII characters in the range from 32 to 127, so that every character provides seven bits. The total number of bits is seven times the length of the passphrase and repeats thereafter. Bitsource stores the position in the bit stream, the length of the phrase and the total number of bits. The method bitString:count: is used for debugging purposes. It converts some number of bits at the least significant end of an unsigned long integer into a string.
We extract the bits one after the other and prepend them to the result string (we prepend because the less significant bits come first).
The method initWithString: initializes a bit source from a passphrase. It checks that the phrase is not empty, allocates a buffer for the phrase, checks that no characters are out of range and copies the characters into the buffer. It initializes the postion to be at the beginning of the string and computes the total number of bits that the phrase can provide before it repeats.
The method getBits: does the actual work. It returns the requested number of bits in an unsigned long integer, scanning the phrase from left to right and wrapping at the end of the phrase.
The method computes what character corresponds to the current position and what bit to extract. It starts with the highest bit because the phrase is scanned from left to right. It writes the bit into the result integer and moves to the next position. It starts over if it reaches the end of the phrase. The method dealloc frees the buffer where the phrase was stored.
We can now move on to the actual algorithm. There are several steps.
We start with several blocks of declarations: the first block lets us access the command line arguments; the second, the file manager, which we need to check whether the image is readable; the third, various strings, one for each command line argument; the fourth, the bit source; the fifth, the image that we’ll manipulate, the bitmap representation that wraps the image data, the data themselves, a flag to tell whether the image is planar or meshed, and how many samples there are per pixel; the sixth, the permutation of the pixel locations, how many bits we’ll need for each item in the permutation, and the dimensions of the image; the seventh and last block, a buffer for accessing individual bytes of the long integer that holds the message length and various integers that store state while we process the image.
Processing command line arguments is the easy part. First check that there is the right number of arguments; then check the value of the encode-decode switch.
Initialize the bit source from the passphrase:
Get the file name and check that the file is readable, and construct the name of the output file, which we obtain by placing the string -stg between the extension TIFF and the rest of the filename.
Now open the image, extract the image dimensions and compute the total number of pixels. Compute the integer logarithm of this number. It tells us how many bits we need to request from the bit source when we construct the permutation of pixel locations.
The next step is to actually construct the permutation. We use the idea from The Art of Computer Programming, also discussed elsewhere in this document. Start with a fully ordered permutation and move an item it to the end; decrease the end marker, repeat. The choice of the item is not random in our case. It is determined by reading the required number of bits from the bit source. The item that we move to the end is given by the remainder of the bits modulo the length from the start to the current end marker.
The last step before the actual computation is to gain access to the data planes where the image is stored. The red data plane comes first in a planar configuration. We obtain the data by looking for a bitmap representation among the representations of the image and exit if no such representation is found. Otherwise we ask the representation for the data, and set the planarity flag and the number of samples per pixel.
We now discuss the encode/decode process. The first step in the encode process is to compute the number of bytes the image can hold, read the data from the standard input and check that they fit into the image. We need an extra sizeof(unsigned long) bytes to store the length of the message.
The actual write process consists of two steps: first write the message length, then write the message. The variable dlbytes provides access to the individual bytes of the message length value; dbytes, to the bytes of the bitmap data plane. We iterate over a range from negative the number of bytes that hold the message length, to the length of the message. First extract the byte that is about to be written. Then write all eight bits, one after the other. The permutation tells us where in the image the bit goes. We write the bit by first zeroing the target bit and then setting the target bit to the bit that we want to record. Note that there is a fifty percent chance in a diverse image that the target bit already agrees with the source bit and no change will be detected. We must multiply the index given by the permutation by the number of samples per pixel if the image is not planar (meshed). We write the most significant bit first to facilitate later reads.