OTP: The Fun and Frustration of C

Patrick Reagan, Former Development Director

Article Category: #Code

Posted on

When David shared the one-time pad challenge with the team, I was eager to dive back into some C. In 2015, high-level languages prevail, so a low-level C implementation may be seen by some as work that is unnecessarily difficult, unnecessarily tedious, and, well, unnecessary. Since the exercise was presented as an entertaining diversion, I rejected these reasons and focused on the work that fluctuated between contemplation and frustration. Questions like "Do I have enough memory for that?" and "Is there a risk of overflow?" gave way to frustration when, yes, I inevitably forgot to increment a loop counter.

Despite this potential for frustration, C's storage of a character as a single byte eases the act of converting between a letter and its numeric equivalent. Treating the letter as a character will show the character. Treating the letter as a number will show the number; the conversion is implicit:

 printf("The value of '%c' is '%d'.\n", 'a', 'a');
=> "The value of 'a' is '97'."

printf("The value of '%c' is '%d'.\n", 97, 97);
=> "The value of 'a' is '97'."

This representation is helpful in two other scenarios. Checking the inclusion of a character in a given range (though there is a better option, see is_hex_digit below):

x = 'b';
printf("'a' <= '%c' <= 'f': %d\n", x, (x <= 'f' && x >= 'a') ? 1 : 0);
=> "'a' <= 'b' <= 'f': 1"

x = 'g';
printf("'a' <= '%c' <= 'f': %d\n", x, (x <= 'f' && x >= 'a') ? 1 : 0);
=> "'a' <= 'g' <= 'f': 0"

Converting a character to its hexadecimal equivalent:

printf("Value of '%c', is '%02x'.\n", 'a', 'a');
=> "Value of 'a', is '61'."

I started with these building blocks to write some helpful utility functions and then tackled some of the harder problems that C presented. Let's step through some highlights.

Validation and Conversion

Here are three examples where I was able to apply the previously-mentioned techniques:

is_hex_digit

For a single hexadecimal digit, this checks to see if it is within the space of hexadecimal characters. The test is case-insensitive, so I check both the range a-f and A-F:

 short is_hex_digit(char digit)
{
    short is_hex_digit = 0;

    is_hex_digit = is_hex_digit || (digit >= '0' && digit <= '9');
    is_hex_digit = is_hex_digit || (digit >= 'A' && digit <= 'F');
    is_hex_digit = is_hex_digit || (digit >= 'a' && digit <= 'f');

    return is_hex_digit;
}

Update 2/19/15: As mat points out in the comments, using isxdigit is a better option. I updated the code to take advantage of this function.

hex_digit_to_decimal

Again, given a single hexadecimal digit, this converts the character to its decimal equivalent:

int hex_digit_to_decimal(char hex)
{
    char subtrahend = '0';

    if (hex >= 'a') {
        subtrahend = HEX_LOWER_OFFSET;
    } else if (hex >= 'A') {
        subtrahend = HEX_UPPER_OFFSET;
    }

    return hex - subtrahend;
}

I define the constant HEX_LOWER_OFFSET as ('a' - 10) and the constant HEX_UPPER_OFFSET as ('A' - 10). When substituting these values in the above function the result is:

'a' - ('a' - 10) = 97 - (97 - 10) = 97 - (87) = 10
'b' - ('a' - 10) = 98 - (97 - 10) = 98 - (87) = 11
...
These values correspond to the hexadecimal values for a, b, and through to f. The same holds true for the upper-cased versions.

decimal_to_hex

This should look familiar as it's a slight variation on the simple example I introduced earlier. Rather than printing out the hexadecimal representation of a decimal digit, this function will return it as a string:

char *decimal_to_hex(int number)
{
    char *hex = calloc(3, sizeof(char));
    snprintf(hex, 3, "%02x", number);

    return hex;
}

Using the Key

As David mentioned, fetching a byte from the key needs to wrap around if there aren't enough characters to encode the message. It's not a hard problem, but my approach differs from what I would do in an object-oriented language:

next_key_byte

Given the user-supplied key, this will pull off two characters at a time, resetting the index if its value exceeds the length of the key:

int next_key_byte(char *key, int key_length)
{
    char buf[2];
    static int key_index = 0;
    int i = 0;

    for (i = 0; i < sizeof(buf); i++) {
        if (key_index > (key_length - 1)) { key_index = 0; }

        buf[i] = key[key_index];

        key_index++;
    }

    return hex_to_decimal(buf);
}

The static key_index variable keeps track of position between invocations, so each call to this function will return the next byte. Instead of a static variable I could have used an argument passed by reference, but thread safety isn't a concern for this program.

Reading from STDIN

This is where my language choice presented the biggest challenge. Reading a line of input terminated by a newline can be easily, if unsafely, performed by using gets, but the process is more challenging when your input is newline-deficient and you want to perform the read safely:

read_from

Given an input stream, this reads the contents into a growable buffer until it reaches the end of the stream. Because the input length isn't known at compile time, I allocate a buffer of BUF_SIZE length (defined as 32), read in that number of characters, and save it into an appropriately-sized input buffer. If there is more data to fetch, it returns to the stream to fetch the additional characters, resizes the buffer, and appends the new data to the input buffer. Once all the characters in the stream have been retrieved, the function will return a pointer to the full contents:

char *read_from(FILE *stream)
{
    char *input  = NULL, *tmp = NULL;
    char *buffer = calloc(BUF_SIZE, sizeof(char));

    int buffer_index = 0,
        iteration    = 0;

    size_t input_index = 0,
           chars_read  = 0;

    while (1) {
        chars_read = fread(buffer, sizeof(char), BUF_SIZE - 1, stdin);
        tmp        = realloc(input, (BUF_SIZE * iteration) + (chars_read + 1));

        if (tmp == NULL) {
            return input;
        } else {
            input = tmp;
        }

        for (buffer_index = 0; buffer_index < chars_read; buffer_index++) {
            input[input_index] = buffer[buffer_index];
            input_index++;
        }

        input[input_index] = 0;

        if (feof(stdin)) { break; }

        iteration++;
    }

    free(buffer);

    return input;
}

Working in a high-level language insulates you from the concerns of byte-level manipulations, but it's good to return to this world to recognize what work the machine performs and to appreciate that the majority of your time is spent not recognizing it. Different perspectives can improve your skill as a programmer; quite often, we have no idea when we will draw from these experiences, but having more to draw from increases our chance for serendipitous inspiration.

Other Implementations

This was my first post in the one-time pad series. I wrote about C not because it was the first language I used to solve this problem, but because I think it will provide good contrast when I write about my other experiences. In the coming weeks, I will share highlights from my solutions written in D and Julia.

Related Articles