The pointers we have looked at so far have all been pointers to various types of data objects, but it is also possible to have pointers to functions. Pointers to functions are useful for approximately the same reasons as pointers to data: when you want an extra level of indirection, when you'd like the same piece of code to call different functions depending on circumstances.
24.1 Declaring, Assigning, and Using Function Pointers
24.2 What are Function Pointers Good For?
24.3 Function Pointers and Prototypes
Read sequentially: prev next up
Just as for data pointers, we can think of three steps involved in using function pointers. First, we must declare a variable which can hold a pointer to a function, and this ends up being a somewhat complex declaration. A simple function pointer declaration looks like this:
int (*pfi)();
This declares pfi as a pointer to a function which will return an int. As in other declarations, the * indicates that a pointer is involved, and the parentheses () indicate that a function is involved. But what about the extra parentheses around (*pfi)? They're needed because there are precedence relationships in declarations just as there are in expressions, and when the default precedence doesn't give you what you want, you have to override it with explicit parentheses. In declarations, the () indicating functions and the [] indicating arrays "bind" more tightly than the *'s indicating pointers. Without the extra parentheses, the declaration above would look like
int *pfi(); /* WRONG, for pointer-to-function */
and this would declare a function returning a pointer to int. With the explicit parentheses, however, int (*pfi)() tells us that pfi is a pointer first, and that what it's a pointer to is a function, and what that function returns is an int.
It's common to use typedefs (see section 18.1.6) with complicated types such as function pointers. For example, after defining
typedef int (*funcptr)();
the identifier funcptr is now a synonym for the type "pointer to function returning int". This typedef would make declaring pointers such as pfi considerably easier:
funcptr pfi;
(In section 18.1.6, we mentioned that typedefs were a little bit like preprocessor define directives, but better. Here we see another reason: there's no way we could define a preprocessor macro which would expand to a correct function pointer declaration, but the funcptr type we just defined using typedef will work just fine.)
Once declared, a function pointer can of course be set to point to some function. If we declare some functions:
extern int f1(); extern int f2(); extern int f3();
then we can set our pointer pfi to point to one of them:
pfi = &f1;
or to one or another of them depending on some condition:
if(condition) pfi = &f2; else pfi = &f3;
(Of course, we're not restricted to these two forms; we can assign function pointers under any circumstances we wish. The second example could be rendered more compactly using the conditional operator: pfi = condition ? &f2 : &f3 .)
In these examples, we've used the & operator as we always have, to generate a pointer. However, when generating pointers to functions, the & is optional, because when you mention the name of a function but are not calling it, there's nothing else you could possibly be trying to do except generate a pointer to it. So, most programmers write
pfi = f1;
or
if(condition) pfi = f2; else pfi = f3;
(or, equivalently, using the conditional operator, pfi = condition ? f2 : f3 ).
(The fact that a function pointer is generated automatically when a function appears in an expression but is not being called is very similar to, and in fact related to, the fact that a pointer to the first element of an array is generated automatically when an array appears in an expression.)
Finally, once we have a function pointer variable which does point to a function, we can call the function that it points to. Broken down to a near-microscopic level, this, too, is a three-step procedure. First, we write the name of the function pointer variable:
pfi
This is a pointer to a function. Then, we put the * operator in front, to "take the contents of the pointer":
*pfi
Now we have a function. Finally, we append an argument list in parentheses, along with an extra set of parentheses to get the precedence right, and we have a function call:
(*pfi)(arg1, arg2)
The extra parentheses are needed here for almost exactly the same reason as they were in the declaration of pfi. Without them, we'd have
*pfi(arg1, arg2) /* WRONG, for pointer-to-function */
and this would say, "call the function pfi (which had better return a pointer), passing it the arguments arg1 and arg2, and take the contents of the pointer it returns." However, what we want to do is take the contents of pfi (which is a pointer to a function) and call the pointed-to function, passing it the arguments arg1 and arg2. Again, the explicit parentheses override the default precedence, arranging that we apply the * operator to pfi and then do the function call.
Just to confuse things, though, parts of the syntax are optional here as well. There's nothing you can do with a function pointer except assign it to another function pointer, compare it to another function pointer, or call the function that it points to. If you write
pfi(arg1, arg2)
it's obvious, based on the parenthesized argument list, that you're trying to call a function, so the compiler goes ahead and calls the pointed to function, just as if you'd written
(*pfi)(arg1, arg2)
When calling the function pointed to by a function pointer, the * operator (and hence the extra set of parentheses) is optional. I prefer to use the explicit *, because that's the way I learned it and it makes a bit more sense to me that way, but you'll also see code which leaves it out.
In this section we'll list several situations where function pointers can be useful.
A common problem is sorting, that is, rearranging a set of items into a desired sequence. It's especially common for the set of items to be contained in an array. Suppose you were writing a function to sort an array of strings. Roughly speaking, the algorithm for sorting the elements of an array looks like this:
In C, the outline of the function might therefore look like this:for all pairs of elements in the array if the pair is out of order swap them
stringsort(char *strings[], int nstrings)
{
for( all pairs of elements i, j in strings )
{
if(strcmp(strings[i], strings[j]) > 0)
{
char *tmp = strings[i];
strings[i] = strings[j];
strings[j] = tmp;
}
}
}
Remember, the library function strcmp compares two strings and returns either a negative number, zero, or a positive number depending on whether the first string was "less than," the same as, or "greater than" the second string. If you haven't seen it before, the series of three assignments involving the temporary variable tmp is a standard idiom for swapping two items. (The more obvious pair of assignments
strings[i] = strings[j]; strings[j] = strings[i];
would just about as obviously not work, because the first line would obliterate strings[i] before the second line had a chance to assign it to string[j].)
Naturally, a real sort function implementing a real sorting algorithm will be a bit more elaborate; in particular, the details of how we choose which pairs of elements to compare (and in what order) can make a huge difference in how efficiently the function completes its job. We'll show a complete example in a minute, but for now, our more important focus is on the comparison step. The final ordering of the strings will depend on the strcmp function's definition of what it means for one string to be "less than" or "greater than" another. How strcmp compares strings (as we saw in chapter 8) is to look at them a character at a time, based on their values in the machine's character set (which is how C always treats characters). In ASCII, the character set that most computers use, the codes representing the letters are in order, so strcmp gives you something pretty close to alphabetical order, with the significant difference that all the upper-case letters come before the lower-case letters. So a string-sorting function built around strcmp would sort the words "Zeppelin," "able," "baker," and "Charlie" into the order
strcmp is quite useless for sorting strings which consist entirely of digits, because it goes from left to right, stopping when it finds the first difference, without regard to whether it was comparing the ten's digit of one number to the hundred's of another. For example, a strcmp-based sort would sort the strings "1", "2", "3", "4", "12", "24", and "234" into the orderCharlie Zeppelin able baker
1 12 2 234 24 3 4
Depending on circumstances, we might want our string sorting function to sort into the order that strcmp uses, or into "dictionary" order (that is, with all the a's together, both upper-case and lower-case), or into numeric order. We could pass our stringsort function a flag or code telling it which comparison strategy to use, although that would mean that whenever we invented a new comparison strategy, we would have to define a new code or flag value and rewrite stringsort. But, if we observe that the final ordering depends entirely on the behavior of the comparison function, a neater implementation is if we write our stringsort function to accept a pointer to the function which we want it to use to compare each pair of strings. It will go through its usual routine of comparing and exchanging, but whenever it makes the comparisons, the actual function it calls will be the function pointed to by the function pointer we hand it. Making it sort in a different order, according to a different comparison strategy, will then not require rewriting stringsort at all, but instead will just involve passing it a pointer to a different comparison function (after perhaps writing that function).
Here is a complete implementation of stringsort, which also accepts a pointer to the string comparison function:
void stringsort(char *strings[], int nstrings, int (*cmpfunc)())
{
int i, j;
int didswap;
do {
didswap = 0;
for(i = 0; i < nstrings - 1; i++)
{
j = i + 1;
if((*cmpfunc)(strings[i], strings[j]) > 0)
{
char *tmp = strings[i];
strings[i] = strings[j];
strings[j] = tmp;
didswap = 1;
}
}
} while(didswap);
}
(This code uses a fairly simpleminded sorting algorithm. It repeatedly compares adjacent pairs of elements, keeping track of whether it had to exchange any. If it makes it all the way through the array without exchanging any, it's done; otherwise, it has at least made progress, and it goes back for another pass. This is not the world's best algorithm; in fact it's not far from the infamous "bubble sort." Although our focus here is on function pointers, not sorting, in a minute we'll take time out and look at a simple improvement to this algorithm which does make it a decent one.)
Now, if we have an array of strings
char *array1[] = {"Zeppelin", "able", "baker", "Charlie"};
we can sort it into strcmp order by calling
stringsort(array1, 4, strcmp);
Notice that in this call, we are not calling strcmp immediately; we are generating a pointer to strcmp, and passing that pointer as the third argument in our call to stringsort.
If we wanted to sort these words in "dictionary" order, we could write a version of strcmp which ignores case when comparing letters:
#include <ctype.h>
int dictstrcmp(char *str1, char *str2)
{
while(1)
{
char c1 = *str1++;
char c2 = *str2++;
if(isupper(c1))
c1 = tolower(c1);
if(isupper(c2))
c2 = tolower(c2);
if(c1 != c2)
return c1 - c2;
if(c1 == '\0')
return 0;
}
}
(The functions isupper and tolower are both from the standard library and are declared in <ctype.h>. isupper returns "true" if its argument is an upper-case letter, and tolower converts its argument to a lower-case letter.)
With dictstrcmp in hand, we can sort our array a different way:
stringsort(array1, 4, dictstrcmp);
(Some C libraries contain case-independent versions of strcmp called stricmp or strcasecmp. Both of these would do the same thing as our dictstrcmp, although neither of them is standard.)
We can also write another string-comparison function which treats the strings as numbers, and compares them numerically:
int numstrcmp(char *str1, char *str2)
{
int n1 = atoi(str1);
int n2 = atoi(str2);
if(n1 < n2) return -1;
else if(n1 == n2) return 0;
else return 1;
}
Then, we can sort an array of numeric strings correctly:
char *array2[] = {"1", "234", "12", "3", "4", "24", "2"};
stringsort(array2, 7, numstrcmp);
(As an aside, you will occasionaly see code which is supposed to compare two integers and return a negative, zero, or positive result--i.e. just like the numstrcmp function above--but which does so by the seemingly simpler technique of just saying
return n1 - n2;
It turns out that this trick has a serious drawback. Suppose that n1 is 32000, and n2 is -32000. Then n1 - n2 is 64000, which is not guaranteed to fit in an int, and will overflow on some machines, producing an incorrect result. So the explicit comparison code such as numcmp used is considerably safer.)
Finally, since we've started looking at sorting functions, let's look at a simple improvement to the string sorting function we've just been using. It has been plodding along comparing adjacent elements, so when an element is far out of place, it takes many passes to percolate it to the correct position (one cell at a time). The improvement, based on the work of Donald L. Shell, is to compare pairs of more widely-separated elements at first, then proceed to compare more and more closely-situated elements, until on the last pass (or passes) we compare adjacent elements, as before. Since the earlier passes will have done more of the work (and more quickly), the later passes will just have to do the "final cleanup." Here is the improvement:
void stringsort(char *strings[], int nstrings, int (*cmpfunc)())
{
int i, j, gap;
int didswap;
for(gap = nstrings / 2; gap > 0; gap /= 2)
{
do {
didswap = 0;
for(i = 0; i < nstrings - gap; i++)
{
j = i + gap;
if((*cmpfunc)(strings[i], strings[j]) > 0)
{
char *tmp = strings[i];
strings[i] = strings[j];
strings[j] = tmp;
didswap = 1;
}
}
} while(didswap);
}
}
The inner loops are the same as before, except that where we had before always been computing j = i + 1, now we compute j = i + gap, where gap is a new variable (controlled by a third, outer loop) which starts out large (nstrings / 2), and then diminishes until it's 1. Although this code contains three nested loops instead of two, it will end up making far fewer trips through the inner loop, and so will execute faster.
The Standard C library contains a general-purpose sort function, qsort (declared in <stdlib.h>), which can sort any type of data (not just strings). It's a bit trickier to use (due to this generality), and you almost always have to write an auxiliary comparison function. For example, due to the generic way in which qsort calls your comparison function, you can't use strcmp directly even when you're sorting strings and would be satisfied with strcmp's ordering. Here is an auxiliary comparison function and the corresponding call to qsort which would behave like our earlier call to stringsort(array1, 4, strcmp) :
/* compare strings via pointers */
int pstrcmp(const void *p1, const void *p2)
{
return strcmp(*(char * const *)p1, *(char * const *)p2);
}
...
qsort(array1, 4, sizeof(char *), pstrcmp);
When you call qsort,
it calls your comparison function
with pointers to the elements of your array.
Since array1 is an array of pointers,
the comparison function ends up receiving pointers to pointers.
But qsort doesn't know that the array is an array of pointers;
it's written so that it can sort arrays of anything.
(That's why qsort's third argument
is the size of the array element.)
Since qsort doesn't know
what the type of the elements being sorted is,
it uses void pointers
to those elements
when it calls the comparison function.
(The use of void pointers here recalls their use with malloc,
where the situation is similar:
malloc returns pointers to void
because it doesn't know what type we'll use the pointers to point to.)
In the "wrapper" function pstrcmp,
we use
the explicit cast
(char * const *)
to convert the void pointers which the function receives
into pointers to (pointers to char)
so that when we use one * operator
to find out what they point to,
we get one of the character pointers (char *)
which we know our array1 array actually contains.
We pass the resulting two char *'s to strcmp
to do most of the work,
but we have to do a bit of work first to recover the correct pointers.
(The extra const
in the cast
(char * const *)
keeps the compiler from complaining,
since the pointers being passed in to the comparison function
are actually const void *,
meaning that although it may not be clear what they point to,
it's guaranteed that
we won't be using them
to modify whatever it is they point to.
We need to keep a level of const-ness in the converted pointers
so that the compiler doesn't worry
that we're going to accidentally use the converted pointers
to modify what we shouldn't.)
That was a rather elaborate first example of what function pointers can be used for! Let's move on to some others.
Suppose you wanted a program to plot equations. You would give it a range of x and y values (perhaps -10 to 10 along each axis), and ask it to plot y = f(x) over that range (e.g. for -10 <= x <= 10). How would you tell it what function to plot? By a function pointer, of course! The plot function could then step its x variable over the range you specified, calling the function pointed to by your pointer each time it needed to compute y. You might call
plot(-10, 10, -10, 10, sin)
to plot a sine wave, and
plot(-10, 10, -10, 10, sqrt)
to plot the square root function.
(If you were to try to write this plot function, your first question would be how to draw lines or otherwise do graphics in C at all, and unfortunately this is one of the questions which the C language doesn't answer. There are potentially different ways of doing graphics, with different system functions to call, for every different kind of computer, operating system, and graphics device.)
One of the simplest (and allegedly least user-friendly) styles of user interfaces is the command line interface, or CLI. The user types a command, hits the RETURN key, and the system interprets the command. Often, the first "word" on the line is the command name, and any remaining words are "arguments" or "option flags." The various shells under Unix, COMMAND.COM under MS-DOS, and the adventure game we've been writing are all examples of CLI's. If you sit down to write some code implementing a CLI, it's simple enough to read a line of text typed by the user, and simple enough to break it up into words. But how do you map the first word on the line to the code which implements that command? A straightforward, rather simpleminded way is to use a giant if/else chain:
if(strcmp(command, "agitate") == 0)
{ code for "agitate" command }
else if(strcmp(command, "blend") == 0)
{ code for "blend" command }
else if(strcmp(command, "churn") == 0)
{ code for "churn" command }
...
else fprintf(stderr, "command not found\n");
This works, but can become unwieldy. Another technique is to write several separate functions, each implementing one command, and then to build a table (typically an array of structures) associating the command name as typed by the user with the function implementing that command. In the table, the command name is a string and the function is represented by a function pointer. With this table in hand, processing the user's command becomes a relatively simple matter of searching the table for the matching command string and then calling the corresponding pointed-to function. (We'll see an example of this technique in this week's assignment.)
Our last example concerns larger, more elaborate systems which manipulate many types of data. (Here, by "types of data," I am referring to data structures used by the program, presumably implemented as structs, but in any case more elaborate than C's basic data types.) It is often the case that similar operations must be performed on dissimilar data types. The conventional way of implementing such operations is to have the code for each operation look at the data type it's operating on and adjust its behavior accordingly; in the worst case, each piece of code (for each operation) will have a long switch statement or if/else chain switching among n separate, different ways of performing the operation on n different data types. If a new data type is ever added, new cases must be added to all segments of the code implementing all of the operations.
Another way of organizing such code is to place one or more function pointers right there in the data structures describing each data type. These pointers point to functions which are streamlined and optimized for performing the operations on just one data type. (Each piece of data would obviously have its pointer(s) set to point to the function(s) for its own data type.) This idea is one of the cornerstones of Object-Oriented Programming. We could use a version of it in our adventure game, too: rather than writing new, global pieces of code implementing each new command verb we want to let the user type, and then making each of those pieces of code check all sorts of attributes to ensure that the command can't be used on inappropriate objects ("break water with cup", "light candle with bucket," etc.) we could attach special-purpose pieces of code to the individual objects themselves (via function pointers, of course) and arrange that the code would only fire up if the player were trying to use the relevant object.
It's generally a good idea to have a function prototype in scope whenever you call a function. Function prototypes allow the compiler to generate correct code for function calls, and to verify that you've called a function with the correct number and type of arguments. Standard library functions are prototyped by including the relevant standard header files, and it's generally recommended that you write prototype declarations for your own functions, too, usually placing those declarations in your own header files. But what prototype can be used for an indirect function call using a function pointer, such as
(*pfi)(arg1, arg2)
In general, it won't be known until run time what function pfi actually points to, so there's no way for the compiler to check the call against the prototype of the actually-called function. (We may know that pfi points to f1, f2, or f3, and we may have supplied prototypes for those functions, but that's immaterial.)
We've seen that when you declare a function pointer, you must declare what the return value of the pointed-to function will be. It's also possible to specify what the prototype of the pointed-to function will be. Here's our earlier declaration of pfi, beefed up with a prototype for the arguments:
int (*pfi)(int, int);
Now we know that pfi is a pointer to a function, that the function (whatever it is) accepts two int arguments, and that the function returns an int. Having specified this, the compiler will now be able to do some more checking for us. If we call
(*pfi)(1, 2, 3)
the compiler will complain, because it knows that the function pointed to by pfi is supposed to receive two arguments, but we've passed three. The compiler is also in a position to verify that we actually set pfi to point to functions that accept two int arguments. Our examples so far were in terms of functions which we declared as
extern int f1(); extern int f2(); extern int f3();
that is, as functions taking unspecified arguments and returning int. (Remember, empty parentheses in an external function declaration indicate that the function accepts unspecified arguments, while empty parentheses in a function definition indicate that the function accepts no arguments.) So the compiler won't be able to check the assignments unless we also provide prototypes:
extern int f1(int, int); extern int f2(int, int); extern int f3(int, int);
Now, when we assign
pfi = f1;
the compiler can verify that the function pointer being assigned is to a function which accepts two int arguments, as pfi expects. If, on the other hand, we declared and assigned
extern int x(int); pfi = x;
the compiler would complain, because pfi is supposed to point to a function which accepts two int arguments, and the eventual call to (*pfi)() is going to be verified to be a call passing two arguments, so assigning pfi to point to x (which accepts a single argument) is incorrect.