4.7 Representing Keys (or Other Binary Data) as English Text
4.7.1 Problem
You
want to use an easy-to-read format for displaying keys (or
fingerprints or some other interesting binary data). English would
work better than a hexadecimal representation because
people's ability to recognize the key as correct by
sight will be better.
4.7.2 Solution
Map a particular number of bits to a
dictionary of words. The
dictionary should be of such a size that an exact mapping to a number
of bits is possible. That is, the dictionary should have a number of
entries that is a power of two.
4.7.3 Discussion
The
spc_bin2words()
function shown here converts a binary string of the specified number
of bytes into a string of English words. This function takes two
arguments: str is the binary string to convert,
and len is the number of bytes to be converted.
#include <string.h>
#include <stdlib.h>
#include "wordlist.h"
#define BITS_IN_LIST 11
#define MAX_WORDLEN 4
/* len parameter is measured in bytes. Remaining bits padded with 0. */
unsigned char *spc_bin2words(const unsigned char *str, size_t len) {
short add_space = 0;
size_t i, leftbits, leftovers, scratch = 0, scratch_bits = 0;
unsigned char *p, *res;
res = (unsigned char *)malloc((len * 8 / BITS_IN_LIST + 1) * (MAX_WORDLEN + 1));
if (!res) abort( );
res[0] = 0;
for (i = 0; i < len; i++) {
leftovers = str[i];
leftbits = 8;
while (leftbits) {
if (scratch_bits + leftbits <= BITS_IN_LIST) {
scratch |= (leftovers << (BITS_IN_LIST - leftbits - scratch_bits));
scratch_bits += leftbits;
leftbits = 0;
} else {
scratch |= (leftovers >> (leftbits - (BITS_IN_LIST - scratch_bits)));
leftbits -= (BITS_IN_LIST - scratch_bits);
leftovers &= ((1 << leftbits) - 1);
scratch_bits = BITS_IN_LIST;
}
if (scratch_bits = = BITS_IN_LIST) {
p = words[scratch];
/* The strcats are a bit inefficient because they start from the front of
* the string each time. But, they're less confusing, and these strings
* should never get more than a few words long, so efficiency will
* probably never be a real concern.
*/
if (add_space) strcat(res, " ");
strcat(res, p);
scratch = scratch_bits = 0;
add_space = 1;
}
}
}
if (scratch_bits) { /* Emit the final word */
p = words[scratch];
if (add_space) strcat(res, " ");
strcat(res, p);
}
res = (unsigned char *)realloc(res, strlen(res) + 1);
if (!res) abort( ); /* realloc failed; should never happen, as size shrinks */
return res;
}
To save space, the dictionary file (wordlist.h)
is not provided here. Instead, you can find it on the
book's web site.
|
The previous code is subtly incompatible with the
S/KEY
dictionary because their dictionary is not in alphabetical order.
(S/KEY is an authentication system using one-time passwords.) Be sure
to use the right dictionary!
|
|
The code is written in such a way that you can use dictionaries of
different sizes if you wish to encode a different number of bits per
word. Currently, the dictionary encodes 11 bits of data (by having
exactly 211 words), where no word is more
than 4 characters long. The web site also provides a dictionary that
encodes 13 bits of data, where no word is more than 6 letters long.
The previous code can be modified to use the larger dictionary simply
by changing the two appropriate preprocessor definitions at the top.
The algorithm takes 11 bits of the binary string, then finds the word
that maps to the unique 11-bit value. Note that it is rare for the
number of bits represented by a single word to align exactly to a
byte. For example, if you were to encode a 2-byte binary string,
those 16 bits would be encoded by 2 words, which could represent up
to 22 bits. Therefore, there will usually be leftover bits. In the
case of 2 bytes, there are 6 leftover bits. The algorithm sets all
leftover bits to 0.
Because of this padding scheme, the output doesn't
always encode how many bytes were in the input. For example, if the
output is 6 words long, the input could have been either 7 or 8 bytes
long. Therefore, you need to manually truncate the output to the
desired length.
4.7.4 See Also
Recipe 4.8
|