Monday, November 22, 2010

Inexpressible in C

Recently at work I encountered something I'd never previously seen in the C language: an inexpressible pattern.

(To note: I'm using the term “pattern” here to denote a design at an abstract, conceptual level, like where someone may say, “Just code the state machine in a switch block within a while loop.” In such a case, implementing a state machine where each state comprises its own unique case block within the switch constitutes a pattern.)

What I ran into at work was a pattern—something I wanted to implement—but couldn't because the language won't allow it. I don't remember this ever happening in C. Sure, there are countless patterns that are ridiculously awkward when expressed in C—for example, object hierarchies—but this pattern I wanted to implement cannot be expressed at all, as far as I can figure out.

What pattern is this? Succinctly, it's the pattern of “a function that returns a pointer to its own type.” This may sound simple enough, but it can't be done—not directly, not exactly, at least—in C. A little code may illustrate best what I'm writing about.

typedef void * (foo_t)(void);

foo_t *bar(void);

Above is a function, bar, that almost returns a pointer to its own type. What it returns is a pointer to a function that returns a pointer to any type, which presumably could be the address of bar. Of course, if I were tolerant of void pointers being tossed about, I could have made things much simpler, like so:

void *bar(void);

Here, bar could return its own address or the address of any other function with the same type or any address at all. But what I wanted to express was a function type, foo_t, that returns a specifically typed pointer to its own type. And that can't be done (as far as I can figure out) because one must replace the void * in the typedef declaration with a infinite regression of cascading types. Function types, in C, cannot return themselves.

I find this interesting for two reasons. The first is that this is exactly the sort of self-referential hole that shows up seemingly everywhere when you're dealing with complex systems. Douglas Hofstadter would be amused. The second reason is that I wonder why I'd never previously tried to implement this pattern. What's unique about my current project that led me to wanting this pattern for the first time?

The most obvious answer is that this is my first real small-system embedded project. The device has no operating system, so my main function really is all there is in the device's little software universe—excluding startup code, of course. What I wanted to use the function-returning-its-own-type pattern for was for implementing different modes that the device can switch into. For example, among other modes, the device has a normal operating mode and a diagnostic mode, and so each mode naturally calls for its own sort of “main” function, like so:

void main_diagnostic_mode(void);
void main_normal_mode(void);

Each modes' main function would then be called from the actual main function whenever appropriate to do so.

However, each mode can end and cause another mode to begin. For example, a user command in normal mode may cause the diagnostic mode to begin. In all cases, which mode should begin next is determined by the logic of the current mode. This suggests the following prototypes:

mode_main_fn_t *main_diagnostic_mode(void);
mode_main_fn_t *main_normal_mode(void);

Here, mode_main_fn_t * is a pointer type that can address main_diagnostic_mode, main_normal_mode, or any of the other modes' main functions. Thus, each mode's main function runs for some indefinite duration and, upon returning, elegantly notifies the main function which mode to run next.

int main(void) {
    mode_main_fn_t *mode = main_normal_mode; /* initial mode */
    while (1) {
        mode = mode();

But, as I described above, this pattern cannot be expressed in the C language unless one defines mode_main_fn_t to be void, which is like cheating. So are typecasts cheating, too.

There are, of course, many solutions to implement for this problem in general. The pattern that I wanted to implement is only one possible solution. That I went about solving the problem a different way doesn't make inexpressibility in C any less interesting.


Ted said...

I'd be personally interested to hear more about this embedded project of yours. I've dealt with embedded programming before on various scales but I've never just skipped the OS before. Closest I've gotten was dealing with ThreadX, which if you don't already know is a single process, multi-threaded OS.

I must say, I'm not surprised you can up with a use for this pattern. You always have had a knack for elegant design.

cmbrandenburg said...