All of the functions we've written so far have taken a fixed number of arguments. We've been careful always to call them with the correct number of arguments, and we've tended to use explicit function prototype declarations so that the compiler can verify this for us. But what about printf? Sometimes we call it with one argument, but often we pass it additional arguments. Why doesn't the compiler complain? And how can printf access the extra arguments we pass to it, when it can't know how many of them there will be or what types they will have, and so can't possibly declare conventional function parameters for them?
In this chapter we'll discuss variable-length argument lists (often referred to by the shorthand term "varargs"), which allow functions such as printf to be written.
25.1 Declaring "varargs" Functions
25.2 Writing a "varargs" Function
25.3 Special Issues with Varargs Functions
25.1 Declaring "varargs" Functions
The ANSI/ISO C Standard requires that all functions which accept a variable number of arguments be declared explicitly to do so, and also that a function prototype be "in scope" (that is, available) whenever a varargs function is called. You may remember that, under certain circumstances, unknown functions can be called at will, with the compiler assuming that they take "normal" arguments and return int. However, that exception does not apply here. (In other words, for varargs functions, function prototypes aren't just a good idea, they're the law.) The reason that prototypes are strictly required is that varargs functions may use special calling sequences; that is, the compiler may have to generate special code for these calls.
The presence of a variable-length argument list is indicated by an ellipsis in the prototype. For example, the prototype for printf, as found in <stdio.h>, looks something like this:
extern int printf(const char *, ...);
Those three dots ... don't mean that I left something out; they are the ellipsis notation, the syntax that C uses to indicate the presence of a variable-length argument list. This prototype says that printf's first argument is of type const char *, and that it takes a variable (and hence unspecified) number of additional arguments.
The ellipsis notation must follow the ordinary, fixed arguments, and there must be at least one fixed argument. Before C23, Standard C provides no way to define a function which accepts only variable arguments. (In general, this is not too much of a restriction, because the function must always be able to determine, for itself, how many arguments there are in the list, invariably by inspecting the first argument(s) in the list. In other words, the function will generally need at least one well-defined, fixed argument anyway, to get a toehold on the problem of figuring out what the other arguments are.)
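As a minimal sketch of the "toehold" idea, here is a small function whose single fixed argument simply says how many int arguments follow. The name sum_ints is hypothetical (it does not come from the text), and the body uses the <stdarg.h> macros that the rest of this chapter explains.

```c
#include <stdarg.h>

/* Prototype with an ellipsis; it must be in scope wherever sum_ints is called. */
int sum_ints(int count, ...);

/* Hypothetical example: add up `count` int arguments.
   The fixed argument is the only way the function knows how many follow. */
int sum_ints(int count, ...)
{
    va_list argp;
    int i, total = 0;

    va_start(argp, count);          /* variable arguments begin after count */
    for (i = 0; i < count; i++)
        total += va_arg(argp, int); /* each extra argument is assumed to be an int */
    va_end(argp);
    return total;
}
```

With this definition, a call such as sum_ints(3, 10, 20, 30) adds the three trailing values. If the caller passes fewer arguments than count claims, or arguments that are not ints, nothing detects the mistake.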
25.2 Writing a "varargs" Function
In ANSI C, the head of a varargs function looks just like its prototype. We will illustrate by writing our own, stripped-down version of printf. The bare outline of the function definition will look like this:
void myprintf(const char *fmt, ...)
{
}
printf's job, of course, is to print its format string while looking for % characters and treating them specially. So the main loop of printf will look like this:
#include <stdio.h>

void myprintf(const char *fmt, ...)
{
    const char *p;

    for(p = fmt; *p != '\0'; p++)
    {
        if(*p != '%')
            putchar(*p);
        else {
            /* handle it specially */
        }
    }
}
In this stripped-down version, we won't worry about width and precision specifiers and other modifiers; we'll always look at the very next character after the % and assume that it's the primary format character. Continuing to flesh out our outline, we get this:
#include <stdio.h>

void myprintf(const char *fmt, ...)
{
    const char *p;

    for(p = fmt; *p != '\0'; p++)
    {
        if(*p != '%')
        {
            putchar(*p);
            continue;
        }

        switch(*++p)
        {
        case 'c':
            /* fetch and print a character */
            break;

        case 'd':
            /* fetch and print an integer */
            break;

        case 's':
            /* fetch and print a string */
            break;

        case 'x':
            /* print an integer, in hexadecimal */
            break;

        case '%':
            /* print a single % */
            break;
        }
    }
}
(For clarity, we've rearranged the former if/else statement slightly. If the character we're looking at is not %, we print it out and continue immediately with the next iteration of the for loop. This is a good example of the use of the continue statement. Everything else in the body of the loop then takes care of the case where we are looking at a %.)
Printing these various argument types out will be relatively straightforward. The $64,000 question, of course, is how to fetch the actual arguments. The answer involves some specialized facilities defined for us by the standard header <stdarg.h>: the type va_list and the macros va_start(), va_arg(), and va_end(). va_list is a special "pointer" type which allows us to manipulate a variable-length argument list. va_start() begins the processing of an argument list, va_arg() fetches arguments from it, and va_end() finishes processing. (Therefore, va_list is a little bit like the stdio FILE * type, and va_start is a bit like fopen.)
Here is the final version of our myprintf function, illustrating the fetching, formatting, and printing of the various argument types. (For simplicity--of presentation, if nothing else--the formatting step is deferred to a version of the nonstandard but popular itoa function.)
#include <stdio.h>
#include <stdarg.h>

extern char *itoa(int, char *, int);    /* nonstandard; assumed to be available */

void myprintf(const char *fmt, ...)
{
    const char *p;
    va_list argp;
    int i;
    char *s;
    char fmtbuf[256];

    va_start(argp, fmt);

    for(p = fmt; *p != '\0'; p++)
    {
        if(*p != '%')
        {
            putchar(*p);
            continue;
        }

        switch(*++p)
        {
        case 'c':
            i = va_arg(argp, int);
            putchar(i);
            break;

        case 'd':
            i = va_arg(argp, int);
            s = itoa(i, fmtbuf, 10);
            fputs(s, stdout);
            break;

        case 's':
            s = va_arg(argp, char *);
            fputs(s, stdout);
            break;

        case 'x':
            i = va_arg(argp, int);
            s = itoa(i, fmtbuf, 16);
            fputs(s, stdout);
            break;

        case '%':
            putchar('%');
            break;

        case '\0':
            /* the format ended with a lone %: back up so that */
            /* the loop sees the '\0' and stops, instead of running past it */
            p--;
            break;
        }
    }

    va_end(argp);
}
Looking at the new lines, we have:
#include <stdarg.h>
This header file is required in any file which uses the variable argument list (va_) macros.
va_list argp;
This line declares a variable, argp, which we use while manipulating the variable-length argument list. The type of the variable is va_list, a special type defined for us by <stdarg.h>.
va_start(argp, fmt);
This line initializes argp and initiates the processing of the argument list. The second argument to va_start() is simply the name of the function's last fixed argument. va_start() uses this to figure out where the variable arguments begin.
i = va_arg(argp, int);
And here's the heart of the matter. va_arg() fetches the next argument from the argument list. The second argument to va_arg() is the type of the argument we expect. Notice carefully that we must supply this argument, which implies that we must somehow know what type of argument to expect next. The variable-length argument list machinery does not know. In this case, we know what the type of the next argument should be because it's supposed to match the format character we're processing. We can see, then, why such havoc results when printf's arguments do not match its format string: printf tells the va_arg machinery to grab an argument of one type, with the type determined by one of the format specifiers, but since the va_arg machinery doesn't know what the actual argument type is, there's no way for it to do any automatic conversion. If the actual argument has the right type for the va_arg call which grabs it (as of course it's supposed to), it works, otherwise it doesn't.
(You may have noticed that we fetched the character to print for %c as an int, not a char. That's deliberate, and is explained in the next section.)
s = va_arg(argp, char *);
Here's another invocation of va_arg(), this time fetching a string, represented as a character pointer, or char *.
va_end(argp);
Finally, when we're all finished processing the argument list, we call va_end(), which performs any necessary cleanup.
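Since itoa is nonstandard, the version above may not link everywhere. The following is a sketch of the same loop under different assumptions: it formats into a caller-supplied buffer instead of writing to stdout, and it replaces itoa with a small local helper. The names myformat, put_ch, and put_int are hypothetical and are not part of the original example.

```c
#include <limits.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: append one character if room remains,
   always leaving space for the terminating '\0'. */
static void put_ch(char *buf, size_t size, size_t *len, char c)
{
    if (*len + 1 < size)
        buf[(*len)++] = c;
}

/* Hypothetical stand-in for itoa: append value in base 10 or 16.
   In base 16 a negative value is shown as its unsigned bit pattern. */
static void put_int(char *buf, size_t size, size_t *len, int value, unsigned base)
{
    char digits[sizeof(unsigned) * CHAR_BIT];
    size_t n = 0;
    unsigned u = (unsigned)value;

    if (base == 10 && value < 0) {
        put_ch(buf, size, len, '-');
        u = 0u - u;                 /* magnitude; well defined even for INT_MIN */
    }
    do {                            /* digits come out least significant first */
        digits[n++] = "0123456789abcdef"[u % base];
        u /= base;
    } while (u != 0);
    while (n > 0)
        put_ch(buf, size, len, digits[--n]);
}

/* Format into buf, truncating if necessary; returns the number of
   characters stored, not counting the terminating '\0'. */
size_t myformat(char *buf, size_t size, const char *fmt, ...)
{
    va_list argp;
    size_t len = 0;
    const char *p, *s;

    if (size == 0)
        return 0;
    va_start(argp, fmt);
    for (p = fmt; *p != '\0'; p++) {
        if (*p != '%') {
            put_ch(buf, size, &len, *p);
            continue;
        }
        switch (*++p) {
        case 'c':
            put_ch(buf, size, &len, (char)va_arg(argp, int));
            break;
        case 'd':
            put_int(buf, size, &len, va_arg(argp, int), 10);
            break;
        case 'x':
            put_int(buf, size, &len, va_arg(argp, int), 16);
            break;
        case 's':
            for (s = va_arg(argp, char *); *s != '\0'; s++)
                put_ch(buf, size, &len, *s);
            break;
        case '%':
            put_ch(buf, size, &len, '%');
            break;
        case '\0':
            p--;    /* lone % at the end: let the loop see the '\0' */
            break;
        }
    }
    va_end(argp);
    buf[len] = '\0';
    return len;
}
```

Writing into a buffer is only a choice made here so that the result can be inspected; the va_start(), va_arg(), and va_end() calls are used exactly as in myprintf.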
25.3 Special Issues with Varargs Functions
When a function with a variable-length argument list is called, the variable arguments are passed using C's old "default argument promotions." These say that types char and short int are automatically promoted to int, and type float is automatically promoted to double. Therefore, varargs functions will never receive arguments of type char, short int, or float. Furthermore, it's an error to "pass" the type names char, short int, or float as the second argument to the va_arg() macro. Finally, for vaguely related reasons, the last fixed argument (the one whose name is passed as the second argument to the va_start() macro) should not be of type char, short int, or float, either.
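To make the promotion rule concrete, here is a small sketch in which the caller passes a char, a short int, and a float, and the function fetches them as int, int, and double. The function name sum_mixed and its one-letter type codes are hypothetical, invented for this illustration.

```c
#include <stdarg.h>

/* Hypothetical example: kinds is a string of type codes, one per extra
   argument: 'c' for char, 'h' for short int, 'f' for float.
   Unrecognized codes are skipped without fetching anything. */
double sum_mixed(const char *kinds, ...)
{
    va_list argp;
    double total = 0.0;
    const char *k;

    va_start(argp, kinds);
    for (k = kinds; *k != '\0'; k++) {
        switch (*k) {
        case 'c':   /* the caller's char arrives as an int */
        case 'h':   /* the caller's short int also arrives as an int */
            total += va_arg(argp, int);
            break;
        case 'f':   /* the caller's float arrives as a double */
            total += va_arg(argp, double);
            break;
        }
    }
    va_end(argp);
    return total;
}
```

Note that the type names handed to va_arg() are int and double throughout, even though the caller's variables may be declared as char, short int, or float.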
A frequently-asked question is, "How can I determine how many arguments my function was actually called with?" The answer, as discussed above, is that you (or your code) must figure it out somehow, generally by looking at the arguments themselves (or, in the case of printf, by using clues which are designed into the first, fixed arguments). There is no Standard way of asking the compiler or run-time system how many arguments were actually passed, or what their types are.
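One common way of "looking at the arguments themselves" is to have the caller mark the end of the list with a sentinel value. The sketch below assumes a list of strings terminated by a null pointer; the name count_strings is hypothetical.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical example: count the strings in a list that the caller
   terminates with (char *)NULL. */
int count_strings(const char *first, ...)
{
    va_list argp;
    const char *s;
    int count = 0;

    va_start(argp, first);
    /* Fetch the next pointer only after seeing that the current one
       is not the sentinel. */
    for (s = first; s != NULL; s = va_arg(argp, char *))
        count++;
    va_end(argp);
    return count;
}
```

A call would look like count_strings("a", "b", "c", (char *)NULL). The cast on the terminator matters: an uncast NULL may be passed as a plain integer 0, which need not have the size or representation of the pointer that va_arg() is told to fetch.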
The macros va_start() and va_arg() are referred to as macros because they can't possibly be functions. va_start() initializes (that is, sets the value of) its first argument, and it uses its second argument not as a value but as a location. Even more unusually, va_arg() accepts a type name as its second argument, which no function in C can ever do; type names otherwise appear as operands only of built-in constructs such as sizeof and casts. Finally, va_arg() has no single return type (as it would have to if it were a function); the type it "returns" is determined by its second argument.
The va_list type, whatever it is, is a mostly normal type. In particular, you can pass it on to other functions, and it is frequently quite useful to do so. For example, suppose that you want to write an error function, which will print nicely annotated error messages complete with filenames, line numbers, severity indicators, and the like. Furthermore, suppose that the rest of your program will find it useful, when calling this error function, to embed % sequences in the string to be printed, requesting that extra arguments be interpolated, just like printf. The obvious question is, will the error function have to duplicate all of printf's code for parsing format strings and formatting variable arguments (which isn't impossible, as we've seen), or can it somehow call on printf or a related function to do most of the work?
To be precise, here's the outline of the error function we'd like to write:
extern char *filename;      /* current input file name */
extern int lineno;          /* current line number */

void error(const char *msg, ...)
{
    fprintf(stderr, "%s, line %d: error: ", filename, lineno);
    /* fprintf(stderr, msg, what goes here? ); */
    fprintf(stderr, "\n");
}
The tricky line is the second call to fprintf. We have the string, msg, we want to print, possibly containing % characters. How do we pass down to fprintf the extra arguments which our caller passed to us?
The answer is that we don't; there's no way to say "call this function with the same arguments I got, however many of them there are." (The run-time system simply doesn't have enough information to do this sort of thing, which is why it can't tell you how many arguments you got called with, either.) However, there's a variant version of fprintf which is designed for just this sort of situation. The variant is called vfprintf (where the v stands for "varargs"), and a call to it looks something like this:
void error(const char *msg, ...)
{
    va_list argp;

    fprintf(stderr, "%s, line %d: error: ", filename, lineno);
    va_start(argp, msg);
    vfprintf(stderr, msg, argp);    /* pass the whole argument list along */
    va_end(argp);
    fprintf(stderr, "\n");
}
We declare a local variable of type va_list and call va_start(), as before. However, all we do with our argp variable is pass it to vfprintf as its third argument. vfprintf then does all the work--if we could look inside it, it would look a lot like our version of printf above, except that vfprintf does not call va_start(), because its caller already has. Notice that vfprintf does not accept a variable number of arguments; it accepts exactly three arguments, the third of which is essentially a "pointer" to the extra arguments it will need.
There are also "varargs" versions of printf and sprintf, namely vprintf and vsprintf. These follow the same pattern, accepting a single last argument of type va_list in lieu of an actual variable-length argument list.
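The same pattern can be sketched for a function that builds its message in a buffer instead of printing it. This example assumes a C99 library, since it uses snprintf and vsnprintf, bounded relatives of sprintf and vsprintf that the text above does not mention. The name format_error and its parameters are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical example: build "file, line N: error: message" in buf.
   Returns the length the full text would have had, or a negative value
   if a formatting call failed. */
int format_error(char *buf, size_t size, const char *file, int line,
                 const char *msg, ...)
{
    va_list argp;
    int used, more;

    used = snprintf(buf, size, "%s, line %d: error: ", file, line);
    if (used < 0 || (size_t)used >= size)
        return used;    /* failure, or the prefix alone was truncated */

    va_start(argp, msg);
    /* Like vfprintf, vsnprintf takes a va_list in place of a
       variable-length argument list. */
    more = vsnprintf(buf + used, size - (size_t)used, msg, argp);
    va_end(argp);
    return more < 0 ? more : used + more;
}
```

As with vfprintf, the function that calls va_start() is also the one that calls va_end(); vsnprintf only consumes the list it is handed.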
(Notice that the error function above also called va_end(). This makes sense, since error was the one who called va_start(). The above pattern works, but more complicated ones may not. For example, it's not guaranteed that you can pick a few arguments off of a va_list, pass the va_list to a subfunction to pick a few more off, and then pick the last ones off yourself. Also, there's no direct way to "rewind" a va_list, although it's permissible to call va_end() and then va_start() again, to start over again.)
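The "start over" technique can be sketched with a function that walks its argument list twice: once to measure, once to copy. The name concat is hypothetical, and the list is assumed to be terminated by (char *)NULL.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: concatenate a (char *)NULL-terminated list of
   strings into newly allocated memory, which the caller must free.
   Returns NULL if the allocation fails. */
char *concat(const char *first, ...)
{
    va_list argp;
    const char *s;
    size_t total = 0;
    char *result, *out;

    /* First pass: add up the lengths. */
    va_start(argp, first);
    for (s = first; s != NULL; s = va_arg(argp, char *))
        total += strlen(s);
    va_end(argp);

    result = malloc(total + 1);
    if (result == NULL)
        return NULL;

    /* Second pass: call va_start() again to begin at the first
       variable argument once more, then copy each string. */
    out = result;
    va_start(argp, first);
    for (s = first; s != NULL; s = va_arg(argp, char *)) {
        size_t n = strlen(s);
        memcpy(out, s, n);
        out += n;
    }
    va_end(argp);
    *out = '\0';
    return result;
}
```

Each va_start() is paired with its own va_end() before the next traversal begins, which is the sequence the paragraph above describes as permissible.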