Chapter 16: The Standard I/O (stdio) Library

In chapter 12, we met several of the functions in C's Standard I/O library (often called stdio, sometimes pronounced "studio", named after the header file <stdio.h> which declares its routines). In this chapter, we'll describe most of the functions and other facilities available in <stdio.h>, and explain how they're useful.

(Note: this is an uncharacteristically long and complete chapter. It tries to describe just about all of the Standard I/O library, including features you aren't likely to be using for a while. Don't feel you have to understand every word in this chapter--when you get to an obscure part, just skim through it to get an idea of what's available, and come back and read it again as you have occasion to use the feature.)

16.1: Files and Streams

16.2: Opening and Closing Files (fopen, fclose, etc.)

16.3: Character Input and Output (getchar, putchar, etc.)

16.4: Line Input and Output (fgets, fputs, etc.)

16.5: Formatted Output (printf and friends)

16.6: Formatted Input (scanf)

16.7: Arbitrary Input and Output (fread, fwrite)

16.8: EOF and Errors

16.9: Random Access (fseek, ftell, etc.)

16.10: File Operations (remove, rename, etc.)

16.11: Redirection (freopen)



16.1: Files and Streams

Since the beginning, we've been using "standard input" and "standard output," two predefined I/O streams which are available to every C program. The disposition of these streams is left deliberately unclear: the program can assume that they're connected to the "right place"; usually (for an interactive program) to the user's keyboard and screen, respectively. However, since a program typically doesn't know exactly where they go, it's possible to redirect them, behind the program's back, and thereby to apply a program to some noninteractive input or to capture its output, without rewriting the program or doing any special I/O programming. (This ability is a cornerstone of the Unix "toolkit" methodology. In Unix and several other systems, you can redirect the input or output of a program as you invoke it from the shell command line using the < or > characters.)

Standard input is assumed by functions like getchar, and standard output is assumed by functions like putchar and printf.

Of course, it's also possible to open files (or other I/O sources) explicitly. We can open files using the function fopen; certain systems may also provide specialized ways of opening streams connected to I/O devices or set up in more exotic ways. A successful call to fopen returns a pointer of type FILE *, that is, "pointer to FILE," where FILE is a special type defined by <stdio.h>. A FILE * (also called "file pointer") is the handle by which we refer to an I/O stream in C. I/O functions which do not assume standard input or standard output all accept a FILE * argument telling them which stream to read from or write to. (Examples are getc, putc, and fprintf.) Notice that a file pointer which has been opened on a file is not the same thing as the file itself. A file pointer is a data structure which helps us access or manipulate the file.

Occasionally it is necessary to refer to the standard input or standard output in a situation which calls for a general-purpose FILE *. To handle these cases, there are two predefined constants: stdin and stdout. Both of these are of type FILE *, are declared in <stdio.h>, and can be used wherever a FILE * is required. (For example, we could simulate--or, in fact, implement--getchar as getc(stdin).)

There is also a third predefined stream, the "standard error output." It has its own constant, stderr. By default, stderr is typically connected to the same output device as stdout; the difference between stdout and stderr is that stderr is not redirected when stdout is. stderr is, as its name implies, intended for error messages: if your program printed its error messages to stdout (e.g. by calling printf), they would disappear into the output file if the user redirected the standard output. Therefore, it's customary to print error messages (and also prompts, or anything else that shouldn't be redirected) to stderr, often by calling fprintf(stderr, message, ...).


16.2: Opening and Closing Files (fopen, fclose, etc.)

As mentioned, the fopen function opens a file (or perhaps some other I/O object, if the operating system permits devices to be treated as if they were files) and returns a stream (FILE *) to be used with later I/O calls. fopen's prototype is

	FILE *fopen(char *filename, char *mode)

For the rest of this chapter, we'll often use prototype notations like these to describe functions, since a prototype gives us just the information we need about a function: its name, its return type, and the types of its arguments (perhaps along with identifying names for the arguments).

fopen's prototype tells us that it returns a FILE *, as we expect, and that it takes two arguments, both of type char * (i.e. "string"). The first string is the file name, which can be any string (simple filename or complicated pathname) which is acceptable to the underlying operating system. The second string is the mode in which the file should be opened. The simple mode arguments are

	"r"	open for reading
	"w"	open for writing (creating the file, or truncating it if it already exists)
	"a"	open for writing, appending to the end of the file (creating it if necessary)

You can also tack two optional modifier characters onto the mode string:

	+	open for both reading and writing
	b	open in "binary" mode

Modes "r+" and "w+" let you read and write to the file. You can't read and write at the same time; between stints of reading and stints of writing you must explicitly reposition the read/write indicator (see section 16.9 below).

"Binary" or "b" mode means that no translations are done by the stdio library when reading or writing the file. Normally, the newline character \n is translated to and from some operating system dependent end-of-line representation (LF on Unix, CR on the Macintosh, CRLF on MS-DOS, or an end-of-record mark on record-oriented operating systems). On MS-DOS systems, without binary mode, a control-Z character in a file is treated as an end-of-file indication; neither the control-Z nor any characters following it will be seen by a program reading the file. In binary mode, on the other hand, characters are read and written verbatim; newline characters (and, in MS-DOS, control-Z characters) are not treated specially. You need to use binary mode when what you're reading and writing is arbitrary byte values which are not to be interpreted as text characters.

Of course, it's possible to use both optional modes: "r+b", "w+b", etc. (For maximum portability, it's preferable to put + before b in these cases.)

If, for any reason, fopen can't open the requested file in the requested mode, it returns a null pointer (NULL). Whenever you call fopen, you must check that the returned file pointer is not null before attempting to use the pointer for any I/O.

Most operating systems let you keep only a limited number of files open at a time. Also, many versions of the stdio library allocate only a limited number of FILE structures for fopen to return pointers to. Therefore, if a program opens many files in sequence, it's important for it to close them as it finishes with them. Closing a file fp simply requires calling fclose(fp). (Any open streams are automatically closed when the program exits normally.)
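
As a minimal sketch of the open, check, and close pattern just described, here is a small function that counts the characters in a file. The name count_chars and its convention of returning -1 on failure are assumptions made for this illustration; they are not part of stdio.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count the characters in the named file.
   Returns the count, or -1 if the file can't be opened. */
long count_chars(const char *filename)
{
	FILE *fp;
	long n = 0;

	assert(filename != NULL);

	fp = fopen(filename, "r");
	if(fp == NULL)
		return -1;		/* never do I/O on a null file pointer */

	while(getc(fp) != EOF)
		n++;

	fclose(fp);			/* give the stream back as soon as we're done */
	return n;
}
```

A caller would test the result before using it, for example treating a return value of -1 as "could not open".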

The standard I/O library normally buffers characters--that is, when you're writing, it saves up a chunk of characters and then writes them to the actual file all at once; and when you're reading, it reads a chunk of characters from the file all at once and then parcels them out to the program one at a time (or as many characters at a time as the program asks for). The reasons for buffering have to do with efficiency--the calls to the underlying operating system which request it to read and write files may be inefficient if called once for each character or each few characters, and may be much more efficient if they're always called for large blocks of characters. Normally, buffering is transparent to your program, but occasionally it's necessary to ensure that some characters have actually been written. (One example is when you've printed a prompt to the standard output, and you want to be sure that it's actually been written to the screen.) In these cases, you can call fflush(fp), which flushes a stream's buffered output to the underlying file (or screen, or other device). Naturally, the library automatically flushes output when you call fclose on a stream, and also when your program exits.

(fflush is only defined for output streams. There is no standard way to discard unread, buffered input.)
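
The prompt case mentioned above might be handled as in the following sketch. The function name show_prompt is hypothetical; the only library calls involved are fputs and fflush.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print a prompt (with no trailing newline) and
   flush it, so that it appears before the program waits for input.
   Returns 0 on success, EOF if the write or the flush fails. */
int show_prompt(FILE *ofp, const char *text)
{
	assert(ofp != NULL && text != NULL);

	if(fputs(text, ofp) == EOF)
		return EOF;
	return fflush(ofp);		/* push the buffered characters out now */
}
```

A call such as show_prompt(stdout, "Type a number: ") would normally be followed by a read from standard input.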


16.3: Character Input and Output (getchar, putchar, etc.)

Character-at-a-time input and output is simple and straightforward. The getchar function reads the next character from the standard input; getc(fp) reads the next character from the stream fp. Both return the next character or, if the next character can't be read, the non-character constant EOF, which is defined in <stdio.h>. (Usually the reason that the next character can't be read is that the input stream has reached end-of-file, but it's also possible that there's been some I/O error.) Since the value EOF is distinct from all character values, it's important that the return value from getc and getchar be assigned to a variable of type int, not char. Don't declare the variable to hold getc's or getchar's return value as a char; don't try to read characters directly into a character array with code like

	while(i < max && (a[i] = getc(fp)) != EOF)	/* WRONG, for char a[] */

The code may seem to work at first, but some day it will get confused when it reads a real character with a value which seems to equal that which results when the non-char value EOF is crammed into a char.
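
One way to write the loop correctly is to read each character into an int first, and copy it into the array only once it is known not to be EOF. The following is a sketch; the function name read_chars is an assumption made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read up to max characters from fp into a[],
   stopping early at end-of-file or error.  Returns the number of
   characters stored.  (No terminating \0 is added.) */
size_t read_chars(FILE *fp, char a[], size_t max)
{
	size_t i = 0;
	int c;				/* int, not char, so EOF stays distinct */

	assert(fp != NULL && a != NULL);

	while(i < max && (c = getc(fp)) != EOF)
		a[i++] = (char)c;	/* by now c is known to be a real character */

	return i;
}
```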

One more reminder about getchar: although it returns and therefore seems to read one character at a time, it typically delivers characters from internal buffers which may hold more characters which will be delivered later. For example, most command-line-based operating systems let you type an entire line of input, and wait for you to type the RETURN or ENTER key before making any of those characters available to a program (even if the program thought it was doing character-at-a-time input with calls to getchar). There are, of course, ways to read characters immediately (without waiting for the RETURN key), but they differ from operating system to operating system.

Writing single characters is just as easy as reading: putchar(c) writes the character c to standard output; putc(c, fp) writes the character c to the stream fp. (The character c must be a real character. If you want to "send" an end-of-file condition to a stream, that is, cause the program reading the stream to "get" end-of-file, you do that by closing the stream, not by trying to write EOF to it.)

Occasionally, when reading characters, you find that you've read a bit too far. For example, if one part of your code is supposed to read a number--a string of digits--from a file, leaving the characters after the digits on the input stream for some other part of the program to read, the digit-reading part of the program won't know that it has read all the digits until it has read a non-digit, at which point it's too late. (The situation recalls Dave Barry's recipe for "food heated up": "Put the food in a pot on the stove on medium heat until just before the kitchen fills with black smoke.") When reading characters with the standard I/O library, at least, we have an escape: the ungetc function "un-reads" one character, pushing it back on the input stream for a later call to getc (or some other input function) to read. The prototype for ungetc is

	int ungetc(int c, FILE *fp)

where c is the character which is to be pushed back onto the stream fp. For example, here is a code scrap that reads digits from a stream (and converts them to the corresponding integer), stopping at the first non-digit character and leaving it on the input stream:

#include <ctype.h>

int c, n = 0;
while((c = getchar()) != EOF && isdigit(c))
	n = 10 * n + (c - '0');
if(c != EOF)
	ungetc(c, stdin);

It's only guaranteed that you can push one character back, but that's usually all you need.


16.4: Line Input and Output (fgets, fputs, etc.)

The function

	char *gets(char *line)

reads the next line of text (i.e. up to the next \n) from the standard input and places the characters (except for the \n) in the character array pointed to by line. It returns a pointer to the line (that is, it returns the same pointer value you gave it), unless it reaches end-of-file, in which case it returns a null pointer. It is assumed that line points to enough memory to hold all of the characters read, plus a terminating \0 (so that the line will be usable as a string). Since there's usually no way for anyone to guarantee that the array is big enough, and no way for gets to check it, gets is actually a useless function, and no serious program should call it.

The function

	char *fgets(char *line, int max, FILE *fp)

is somewhat more useful. It reads the next line of input from the stream fp and places the characters, including the \n, in the character array pointed to by line. The second argument, max, gives the maximum number of characters to be written to the array, and is usually the size of the array. Like gets, fgets returns a pointer to the line it reads, or a null pointer if it reaches end-of-file. Unlike gets, fgets does include the \n in the string it copies to the array. Therefore, the number of characters in the line, plus the \n, plus the \0, will always be less than or equal to max. (If fgets reads max-1 characters without finding a \n, it stops reading there, copies a \0 to the last element of the array, and leaves the rest of the line to be read next time.) Since fgets does let you guarantee that the line being read won't go off the end of the array, you should always use fgets instead of gets. (If you want to read a line from standard input, you can just pass the constant stdin as the third argument.) If you'd rather not have the \n retained in the input line, you can either remove it right after calling fgets (perhaps by calling strchr and overwriting the \n with a \0), or maybe call the getline or fgetline function we've been using instead. (See chapters 6 and 12; these functions are also handy in that they return the length of the line read. They differ from fgets in their treatment of overlong lines, though.)
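
The strchr approach to removing the \n might look like the following sketch. The wrapper name read_line is hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: like fgets, but with the trailing \n (if any)
   removed.  Returns line, or a null pointer at end-of-file or error. */
char *read_line(char *line, int max, FILE *fp)
{
	char *p;

	assert(line != NULL && max > 0 && fp != NULL);

	if(fgets(line, max, fp) == NULL)
		return NULL;

	p = strchr(line, '\n');
	if(p != NULL)
		*p = '\0';		/* overwrite the \n with a \0 */

	return line;
}
```

Note that a line too long for the array still comes back in pieces, exactly as with fgets itself; only the final piece had a \n to remove.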

The function

	int puts(char *line)

writes the string pointed to by line to the standard output, and writes a \n to terminate it. It returns a nonnegative value (we don't really care what the value is) unless there's some kind of a write error, in which case it returns EOF.

Finally, the function

	int fputs(char *line, FILE *fp)

writes the string pointed to by line to the stream fp. Like puts, fputs returns a nonnegative value or EOF on error. Unlike puts, fputs does not automatically append a \n.


16.5: Formatted Output (printf and friends)

C's venerable printf function, which we've been using since day one, prints or writes formatted output to the standard output. As we've seen (by example, if not formally), printf's operation is controlled by its first, "format" argument, which is either a simple string to be printed or a string containing percent signs and other characters which cause the formatted values of printf's other arguments to be interspersed with the other text (if any) of the format string.

So far, we've been using simple format specifiers such as %d, %f, and %s. But format specifiers can actually consist of several parts. You can specify a "field width"; for example, %5d prints an integer in a field five characters wide, padding the integer's value with extra characters if necessary so that at least five characters are printed. You can specify a "precision"; for example, %.2f formats a floating-point number with two digits printed after the decimal point. You can also add certain characters which specify various options, such as how a too-narrow field is to be padded out to its field width, or what type of number you're printing. For example, %-5d indicates that the padding characters should be added after the field's value (so that it's left-justified), and %ld indicates that you're printing a long instead of a plain int.

Formally, then, the complete framework for a printf format specifier looks like

	% flags width . precision modifier character

where all of the parts except the % and the final character are optional.

The width gives the minimum overall width of the output (the field) generated by this format specifier. If the output (the number of digits or characters) would be less than the width, it will be padded on the left (or on the right, if the - flag is present), usually with spaces. If the output for the field ends up being larger than the specified width, however, the field essentially overflows or grows; the output is not truncated or anything. That is, printf("%2d", 12345) prints 12345.

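The precision is either:

	the number of digits printed after the decimal point, for %e and %f (for %g, the maximum number of significant digits); or
	the maximum number of characters printed, for %s; or
	the minimum number of digits printed, for the integer formats %d, %i, %o, %u, and %x.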

For example, printf("%.3s", "Hello, world!") prints Hel, and printf("%.5d", 12) prints 00012.

Either the width or the precision (or both) can be specified as *, which indicates that the next int argument from the argument list should be used as the field width or precision. For example, printf("%.*f", 2, 76.54321) prints 76.54.
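
To experiment with width and precision, it can help to capture the formatted text in a string and inspect it. The sketch below does so with sprintf (a printf variant which writes to a string, described later in this section); the function name format_double is an assumption made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format value using a field width and a
   precision which are both taken from the argument list.
   buf must be big enough for the result.
   Returns the number of characters written (not counting the \0). */
int format_double(char *buf, int width, int precision, double value)
{
	assert(buf != NULL);

	/* each * consumes one int argument, in order: width, then precision */
	return sprintf(buf, "%*.*f", width, precision, value);
}
```

With a width of 8 and a precision of 2, the value 76.54321 comes out as 76.54 preceded by three spaces. With a width of 2, the result is still 76.54, since a width is only a minimum.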

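The flags are a few optional characters which modify the conversion in some way. They are:

	-	left-justify the value within its field (the padding goes on the right)
	+	always print a sign (+ or -) for a signed conversion
	space	print a space in place of the sign of a nonnegative number
	0	pad a number with leading zeros instead of spaces
	#	use an "alternate form" (for example, a leading 0 for %o or 0x for %x)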

The modifier specifies the size of the corresponding argument: l for long int, h for short int, L for long double.

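Finally, the format character controls the overall appearance of the conversion (and, along with the modifier, specifies the type of the corresponding argument). We've seen many of these already. The format characters defined by ANSI C are:

	c	a single character (the argument is an int)
	d, i	signed decimal integer
	o	unsigned octal integer
	u	unsigned decimal integer
	x, X	unsigned hexadecimal integer (using abcdef or ABCDEF)
	e, E	floating-point number in exponential notation
	f	floating-point number in decimal notation
	g, G	e or f style (E or f for G), whichever suits the value
	s	string (the argument is a char *)
	p	pointer (the argument is a void *), in an implementation-defined format
	n	prints nothing; stores the number of characters written so far in the int pointed to by the argument
	%	a literal % sign (no argument is used)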

When you want to print to an arbitrary stream, instead of standard output, just use fprintf, which takes a leading FILE * argument:

	fprintf(stderr, "Syntax error on line %d\n", lineno);

Sometimes, it's useful to do some printf-like formatting, but not output the string right away. The sprintf function is a printf variant which "prints" to an in-memory string rather than to a FILE * stream. For example, one way to convert an integer to a string (the opposite of atoi) is:

	int i = 123;
	char string[20];
	sprintf(string, "%d", i);

One thing to be careful of with sprintf, though, is that it's up to you to make sure that the destination string is big enough.
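
One way to take that care is snprintf, which is not described in the text above: it was added to the standard library in C99, takes the size of the destination as an extra argument, and never writes past it. The sketch below assumes a C99 or later compiler; the function name int_to_string is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: convert i to a decimal string in buf, which has
   room for size characters.  Returns 1 if the whole result fit, or 0
   if it was truncated (or if formatting failed). */
int int_to_string(char *buf, size_t size, int i)
{
	int n;

	assert(buf != NULL && size > 0);

	/* writes at most size characters, including the terminating \0 */
	n = snprintf(buf, size, "%d", i);

	/* n is the length the full result would have had */
	return n >= 0 && (size_t)n < size;
}
```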


16.6: Formatted Input (scanf)

Just as putchar has its getchar and fputs has its fgets, there's an input analog to printf, namely scanf. scanf reads characters from standard input, under control of a format string, perhaps converting some components of the string and storing them into variables. For example, just as you could use the call

	printf("(%d, %d)", x, y);

to print two integer values and some surrounding punctuation, you could use the call

	scanf("(%d, %d)", &x, &y);

to attempt to extract two integer values from some input containing similar punctuation.

scanf interprets a format string, much like printf, with the first difference being that scanf attempts to read characters and match them against the format string, rather than printing under control of the format string. For each ordinary character in the format string, scanf expects to see that character on the input; if not, it fails. For each format specifier in the format string, scanf attempts to match and convert a string appropriate to the format specifier, storing the converted result into a variable pointed to by the corresponding argument. If it can't find any characters matching the format specifier, it fails.

Since scanf "returns" many values (one for each format specifier in the format string), it must do so using pointers which the caller passes. For each value to be converted, the caller passes a pointer to the variable (or other location) where scanf should write the converted value. All arguments passed to scanf must be pointers.

The format strings used by scanf are similar to those used by printf, but there are several differences.

The optional width gives the maximum number of characters to read while performing the conversion requested by a particular format specifier. (If there are many adjacent characters which could satisfy a request--many digits for one of the numeric conversions, or many characters for %s conversion--the width keeps scanf from gobbling all of them up at once.)

There is no equivalent to the precision modifier.

If the * flag appears, it indicates that the converted value should be discarded, not written to a location pointed to by one of the pointers in the argument list. (In other words, there is no corresponding argument.) Since * is usurped for this function, there is no way to use a variable field width from the argument list with scanf. There are no other flags.

The modifier characters are more significant. An h indicates that the corresponding integer pointer argument (for %d, %u, %o, or %x) is a short int * or unsigned short int *. An l indicates that the corresponding integer pointer argument (for %d, %u, %o, or %x) is a long int * or unsigned long int *, or that the floating-point pointer argument (for %e, %f, or %g) is a double * rather than a float *. (Similarly, an L indicates a long double *.)

The %c format will read more than one character if an explicit width greater than 1 is specified. The corresponding argument must be a pointer to enough space to hold all the characters read.

The %e, %f, and %g formats all read strings in either scientific notation or conventional decimal fraction m.n notation. (In other words, the three formats act just the same.) However, they assume a float * argument unless the l modifier appears, in which case they expect a double *. (This is in contrast to printf, which accepts either float or double arguments for %e, %f, and %g, due to the default argument promotions.)

The %i format will read a number in decimal, octal, or hexadecimal, taking a leading 0 to indicate octal and a leading 0x (or 0X) to indicate hexadecimal, i.e. the same rules as used by C constants.

The %n format causes the number of characters read so far (by this call to scanf) to be stored in the integer pointed to by the corresponding argument.

The %s format will read a string, up to the next whitespace character, and copy the string, terminated by a \0, to the corresponding argument, which must be a char *. The caller must ensure (perhaps by using an explicit width) that there is enough space to hold the received characters.

scanf has a special format specifier %[...], which matches any string composed of characters specified in the []. For example, %[abc] would match any string composed of a's, b's, and c's. The corresponding argument is a char *; the matched string is written to the location pointed to, followed by a \0. The caller must ensure (perhaps by using an explicit width) that there is enough space to hold the received characters. A second form, %[^...], matches a string of characters not found in the set. For example, scanf("(%[^)])", s) reads, into the string s, a string of characters (possibly including whitespace) from an input in which the string appears enclosed in parentheses. It may also be possible to specify ranges of characters (e.g. %[a-z], %[0-9], etc.), but these are not as portable.
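
Here is a sketch of the parenthesized-string example with an explicit width added, so that the destination array can't overflow. It uses sscanf (the string-scanning variant described later in this section) so that it can be tried on a fixed input; the function name parse_parens and the array size of 20 are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy the text following an opening parenthesis
   in input, up to (but not including) the next ')', into out.
   Returns 1 if a string was stored, 0 otherwise.  A missing closing
   parenthesis goes unnoticed, because it comes after the last
   conversion and so does not affect sscanf's return value. */
int parse_parens(const char *input, char out[20])
{
	assert(input != NULL && out != NULL);

	/* a width of 19 leaves room in out for the terminating \0 */
	return sscanf(input, "(%19[^)])", out) == 1;
}
```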

With the exception of %c, %n, and %[, all of the conversion specifiers skip any leading whitespace (spaces, tabs, or newlines) which might precede the value or string converted. Also, any whitespace character in the format string matches any number of whitespace characters in the input. Therefore, the format "%d %d" would match the input "12 34" or "12     34" or "12\t34". However, the format "%d%d" would match all of these inputs as well, since the second %d first scans past any whitespace preceding the 34.

scanf returns the number of items it successfully converts and stores. It will return a number less than expected (less than the number of format specifiers not containing *, or less than the number of corresponding pointer arguments) if the conversion fails at any point, and it will leave any unrecognized characters (i.e. the ones that caused the last match to fail) waiting in the input for next time. scanf returns EOF if it encounters end-of-file before converting anything.

If you want to read characters from an arbitrary stream, you can use fscanf, which takes an initial FILE * argument.

You can scan and convert characters from a string (rather than from a stream) using sscanf. For example,

	int x, y;
	sscanf("12 34", "%d %d", &x, &y);

would place 12 in x and 34 in y.

scanf and fscanf are seductively useful, but they have a number of drawbacks in practice. They seem to make it very easy to, say, prompt the user for a number:

	int x;
	printf("Type a number:\n");
	scanf("%d", &x);

But what happens if the user fumbles, and types something other than a number? Even if the code checks scanf's return value, and prompts the user again if scanf returns 0, the non-numeric input remains on the input, and will be encountered by the next call to scanf unless some other steps are taken. (That is, scanf will rediscover the user's old, bad input before it gets to any new input.) It's also easy to write things like

	scanf("%d\n", &x);

but this code does not work as intended. The \n in the format string is a whitespace character, which asks scanf to discard any amount of whitespace in the input, so it keeps reading for as long as the characters it sees are whitespace; that is, it reads until it finds something that is not a whitespace character. It won't consume that eventual non-whitespace character once it finds it, but in the process of looking for it, it will seem to jam your program, since the call to scanf won't return right after the user types a number.

Therefore, it's much better to read interactive user input a line at a time, and then use functions like atoi (or perhaps sscanf) to interpret the line that the user typed.
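
A sketch of that line-at-a-time approach follows, using fgets to read the line and sscanf to interpret it. The function name read_int_line and the line length of 100 are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one line from fp and try to interpret it
   as an integer.  Returns 1 and stores the value in *out on success,
   0 if the line didn't begin with a number, EOF at end-of-file. */
int read_int_line(FILE *fp, int *out)
{
	char line[100];		/* arbitrary limit; longer lines are split */

	assert(fp != NULL && out != NULL);

	if(fgets(line, sizeof line, fp) == NULL)
		return EOF;

	/* Bad input has been consumed along with the rest of its line,
	   so it can't be rediscovered by the next call. */
	return sscanf(line, "%d", out) == 1;
}
```

A caller that gets 0 back can simply print the prompt again and call the function once more.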


16.7: Arbitrary Input and Output (fread, fwrite)

Sometimes, you want to read a chunk of characters, without treating it as a "line" (as gets and fgets do) and certainly without doing any scanf-like parsing. Similarly, you may want to write an arbitrary chunk of characters, not as a string or a line. (Furthermore, the chunk might contain one or more \0 characters which would otherwise terminate a string.) In these situations, you want fread and fwrite.

fread's prototype is

	size_t fread(void *buf, size_t sz, size_t n, FILE *fp)

Remember, void * is a "generic" pointer type (the type returned by malloc), which can point to anything. It may make it easier to think about fread at first if you imagine that its first argument were char *. size_t is a type we haven't met yet; it's a type that's guaranteed to be able to hold the size of any object (i.e. as returned by the sizeof operator); you can imagine for the moment that it's unsigned int. fread reads up to n objects, each of size sz, from the stream fp, and copies them to the buffer pointed to by buf. It reads them as a stream of bytes, without doing any particular formatting or other interpretation. (However, the default underlying stdio machinery may still translate newline characters unless the stream is open in binary or "b" mode.) fread returns the number of items read, which may be less than n. It returns 0 (not EOF) at end-of-file.

Similarly, the prototype for fwrite is

	size_t fwrite(void *buf, size_t sz, size_t n, FILE *fp)

fread and fwrite are intended to read and write chunks or "arrays" of items, with the interpretation that there are n items each of size sz. If what you want to do is read n characters, you can call fread with sz as 1, and buf pointing to an array of at least n characters. The return value will be in units of characters. (Of course, you could write n characters by using similar arguments with fwrite.)

Besides reading and writing "blocks" of characters, you can use fread and fwrite to do "binary" I/O. For example, if you have an array of int values:

	int array[N];

you could write them all out at once by calling

	fwrite(array, sizeof(int), N, fp);

This would write them all out in a byte-for-byte way, i.e. as a block copy of bytes from memory to the output stream, i.e. not as strings of digits as printf %d would. Since some of the bytes within the array of int might have the same value as the \n character, you would want to make sure that you had opened the stream in binary or "wb" mode when calling fopen.

Later, you could try to read the integers in by calling

	fread(array, sizeof(int), N, fp);

Similarly, if you had a variable of some structure type:

	struct somestruct x;

you could write it out all at once by calling

	fwrite(&x, sizeof(struct somestruct), 1, fp);

and read it in by calling

	fread(&x, sizeof(struct somestruct), 1, fp);

Although this "binary" I/O using fwrite and fread looks easy and convenient, it has a number of drawbacks, some of which we'll discuss in the next chapter.
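
Whatever the drawbacks, it is worth checking the return values of fwrite and fread, since a count smaller than the one requested is the only sign that something went wrong. The following sketch writes an array of int values and reads it back from the same stream; the function name roundtrip_ints is hypothetical, and the stream is assumed to have been opened for update in binary mode (for example with mode "w+b").

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write n ints from src to fp, go back to the
   beginning of the stream, and read them into dst.
   Returns 1 if all n items made the round trip, 0 otherwise. */
int roundtrip_ints(FILE *fp, const int *src, int *dst, size_t n)
{
	assert(fp != NULL && src != NULL && dst != NULL);

	if(fwrite(src, sizeof(int), n, fp) != n)
		return 0;		/* a short count means a write error */

	rewind(fp);			/* reposition between writing and reading */

	return fread(dst, sizeof(int), n, fp) == n;
}
```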


16.8: EOF and Errors

When a function returns EOF (or, occasionally, 0 or NULL, as in the case of fread and fgets respectively), we commonly say that we have reached "end of file," but it turns out that it's also possible that there's been some kind of I/O error. When you want to distinguish between end-of-file and error, you can do so with the feof and ferror functions. feof(fp) returns nonzero (that is, "true") if end-of-file has been reached on the stream fp, and ferror(fp) returns nonzero if there has been an error. Notice the past tense and passive voice: feof returns nonzero if end-of-file has been reached. It does not tell you that the next attempt to read from the stream will reach end-of-file, but rather that the previous attempt (by some other function) already did. (If you know Pascal, you may notice that the end-of-file detection situation in C is therefore quite different from Pascal.) Therefore, you would never write a loop like

	while(!feof(fp))
		fgets(line, max, fp);

Instead, check the return value of the input function directly:

	while(fgets(line, max, fp) != NULL)

With a very few possible exceptions, you don't use feof to detect end-of-file; you use feof or ferror to distinguish between end-of-file and error. (You can also use ferror to diagnose error conditions on output files.)

Since the end-of-file and error conditions tend to persist on a stream, it's sometimes necessary to clear (reset) them, which you can do with clearerr(FILE *fp).
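
The following sketch shows the intended order of events: first an input function returns EOF, and only then does the code ask ferror why. The function name drain and its return convention are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read and discard characters until getc returns
   EOF, then report why.  Returns 0 for a normal end-of-file, or -1 if
   the stream's error indicator is set. */
int drain(FILE *fp)
{
	assert(fp != NULL);

	while(getc(fp) != EOF)
		;

	/* Only now, after a read has failed, is it meaningful to ask why. */
	if(ferror(fp))
		return -1;
	return 0;
}
```

After drain returns 0, feof(fp) is nonzero, and it stays that way until something like clearerr(fp) or a repositioning call resets the condition.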

What should your program do if it detects an I/O error? Certainly, it cannot continue as usual; usually, it will print an error message. The simplest error messages are of the form

	fp = fopen(filename, "r");
	if(fp == NULL)
		{
		fprintf(stderr, "can't open file\n");
		return;
		}

or

	while(fgets(line, max, fp) != NULL)
		{
		... process input ...
		}

	if(ferror(fp))
		fprintf(stderr, "error reading input\n");

or

	fprintf(ofp, "%d %d %d\n", a, b, c);
	if(ferror(ofp))
		fprintf(stderr, "output write error\n");

Error messages are much more useful, however, if they include a bit more information, such as the name of the file for which the operation is failing, and if possible why it is failing. For example, here is a more polite way to report that a file could not be opened:

	#include <stdio.h>	/* for fopen */
	#include <errno.h>	/* for errno */
	#include <string.h>	/* for strerror */

	fp = fopen(filename, "r");
	if(fp == NULL)
		{
		fprintf(stderr, "can't open %s for reading: %s\n",
					filename, strerror(errno));
		return;
		}

errno is a global variable, declared in <errno.h>, which may contain a numeric code indicating the reason for a recent system-related error such as inability to open a file. The strerror function takes an errno code and returns a human-readable string such as "No such file" or "Permission denied".

An even more useful error message, especially for a "toolkit" program intended to be used in conjunction with other programs, would include in the message text the name of the program reporting the error.
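
A sketch of such a message follows. The function name open_or_complain is hypothetical, as is the assumption that the caller passes in the program's name (typically copied from argv[0]). As noted above, errno only "may" hold a useful code after a failed fopen, so the reason printed depends on the system.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open filename in the given mode.  On failure,
   print a message naming the program, the file, and the reason, and
   return a null pointer for the caller to check. */
FILE *open_or_complain(const char *progname, const char *filename, const char *mode)
{
	FILE *fp;

	assert(progname != NULL && filename != NULL && mode != NULL);

	fp = fopen(filename, mode);
	if(fp == NULL)
		fprintf(stderr, "%s: can't open %s: %s\n",
					progname, filename, strerror(errno));
	return fp;
}
```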


16.9: Random Access (fseek, ftell, etc.)

Normally, files and streams (that is, anything accessed via a FILE *) are read and written sequentially. However, it's also possible to jump to a certain position in a file.

To jump to a position, it's generally necessary to have "been there" once already. First, you use the function ftell to find out what your position in the file is; then, later, you can use the function fseek to get back to a saved position.

File positions are stored as long ints. To record a position, you would use code like

	long int pos;
	pos = ftell(fp);

Later, you could "seek" back to that position with

	fseek(fp, pos, SEEK_SET);

The third argument to fseek is a code telling it (in this case) to set the position with respect to the beginning of the file; this is the mode of operation you need when you're seeking to a position returned by ftell.

As an example, suppose we were writing a file, and one of the lines in it contained the words "This file is n lines long", where n was supposed to be replaced by the actual number of lines in the file. At the time when we wrote that line, we might not know how many lines we'd eventually write. We could resolve the difficulty by writing a placeholder line, remembering where it was, and then going back and filling in the right number later. The first part of the code might look like this:

	long int nlinespos = ftell(fp);
	fprintf(fp, "This file is %4d lines long\n", 0);

Later, when we'd written the last line to the file, we could seek back and rewrite the "number-of-lines" line like this:

	fseek(fp, nlinespos, SEEK_SET);
	fprintf(fp, "This file is %4d lines long\n", nlines);

There's no way to insert or delete characters in a file after the fact, so we have to make sure that if we overwrite part of a file in this way, the overwritten text is exactly the same length as the previous text. That's why we used %4d, so that the number would always be printed in a field 4 characters wide. (However, since the field width in a printf format specifier is a minimum width, with this choice of width, the code would fail if a file ever had more than 9999 lines in it.)
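
Putting the two halves together, a complete version might look something like this sketch. The function name write_counted, the "data line" text, and the decision to refuse counts that won't fit in four digits are all assumptions made for the sake of the example.

```c
#include <stdio.h>

/* write_counted is a hypothetical helper.  It writes a header line
 * followed by ndata data lines to fp, then goes back and fills in
 * the header with the total number of lines (header included).
 * Returns 0 on success, -1 on failure. */
int write_counted(FILE *fp, int ndata)
{
	long int nlinespos, endpos;
	int nlines = 0;
	int i;

	if(ndata < 0 || ndata > 9998)
		return -1;		/* total wouldn't fit in %4d */

	/* remember where the header goes, and write a placeholder */
	nlinespos = ftell(fp);
	if(nlinespos == -1L)
		return -1;
	fprintf(fp, "This file is %4d lines long\n", 0);
	nlines++;

	for(i = 1; i <= ndata; i++)
		{
		fprintf(fp, "data line %d\n", i);
		nlines++;
		}

	/* remember the end, seek back, and overwrite the placeholder
	 * with text of exactly the same length */
	endpos = ftell(fp);
	if(endpos == -1L || fseek(fp, nlinespos, SEEK_SET) != 0)
		return -1;
	fprintf(fp, "This file is %4d lines long\n", nlines);

	/* return to the end so that later output isn't misplaced */
	if(fseek(fp, endpos, SEEK_SET) != 0)
		return -1;

	return ferror(fp) ? -1 : 0;
}
```

Notice that both seeks use positions which were returned by ftell, so the code doesn't depend on file positions being byte offsets.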

Three other file-positioning functions are rewind, which rewinds a file to its beginning, and fgetpos and fsetpos, which are like ftell and fseek except that they record positions in a special type, fpos_t, which may be able to record positions in huge files for which even a long int might not be sufficient.
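
As a small illustration of fgetpos and fsetpos, here is a sketch of a function which reads the next line of a file and then puts the file position back where it was, so that the same line will be read again by the next input call. The name peek_line is hypothetical.

```c
#include <stdio.h>

/* peek_line is a hypothetical helper.  It copies the next line of fp
 * into buf (at most max-1 characters) without consuming it.
 * Returns 0 on success, -1 at end-of-file or on error. */
int peek_line(FILE *fp, char *buf, int max)
{
	fpos_t pos;
	int gotline;

	if(fgetpos(fp, &pos) != 0)		/* record the position */
		return -1;

	gotline = (fgets(buf, max, fp) != NULL);

	if(fsetpos(fp, &pos) != 0)		/* go back to it */
		return -1;

	return gotline ? 0 : -1;
}
```

Both functions take a pointer to the fpos_t object, and both return zero on success. An fpos_t value is meant only to be handed back to fsetpos; unlike the long int returned by ftell, you can't do arithmetic on it.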

If you're ever using one of the "read/write" modes ("r+" or "w+"), you must use a call to a file-positioning function (fseek, rewind, or fsetpos) before switching from reading to writing or vice versa. (You can also call fflush while writing and then switch to reading, or reach end-of-file while reading and then switch back to writing.)
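
To make the rule concrete, here is a sketch of a function which converts the lowercase letters in a file to upper case, in place. It assumes that fp was opened in an update mode such as "r+"; the name upcase_in_place is our own. Calling ftell before every character is not efficient, but it keeps the example short and uses only positions that ftell has returned.

```c
#include <stdio.h>
#include <ctype.h>

/* upcase_in_place is a hypothetical helper.  fp must be open in an
 * update mode such as "r+".  Returns 0 on success, -1 on failure. */
int upcase_in_place(FILE *fp)
{
	long int pos;
	int c;

	rewind(fp);

	for(;;)
		{
		pos = ftell(fp);
		if(pos == -1L)
			return -1;

		c = getc(fp);
		if(c == EOF)
			break;

		if(islower(c))
			{
			/* switching from reading to writing: seek first */
			if(fseek(fp, pos, SEEK_SET) != 0)
				return -1;
			putc(toupper(c), fp);

			/* switching from writing back to reading: seek again,
			 * this time without actually moving */
			if(fseek(fp, 0L, SEEK_CUR) != 0)
				return -1;
			}
		}

	return ferror(fp) ? -1 : 0;
}
```

The call fseek(fp, 0L, SEEK_CUR) moves nowhere; it's there only to satisfy the requirement for a positioning call between the write and the next read.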

In binary ("b") mode, the file positions returned by ftell and used by fseek are byte offsets, such that it's possible to compute an fseek target without having to have it returned by an earlier call to ftell. On many systems (including Unix, the Macintosh, and to some extent MS-DOS), file positioning works this way in text mode as well. Code that relies on this isn't as portable, though, so it's not a good idea to treat ftell/fseek positions in text files as byte offsets unless you really have to.
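
For instance, if a binary file consists of fixed-size records written with fwrite, the position of record number n can be computed directly. The following is only a sketch: struct record and read_record are hypothetical, and the file is assumed to have been opened in binary mode and written on the same system with the same structure layout.

```c
#include <stdio.h>

/* a hypothetical fixed-size record */
struct record
	{
	int id;
	double value;
	};

/* read_record is a hypothetical helper.  It reads record number n
 * (counting from 0) from the binary stream fp into *rp.
 * Returns 0 on success, -1 on failure. */
int read_record(FILE *fp, long int n, struct record *rp)
{
	/* in binary mode, the offset is simply a byte count */
	long int offset = n * (long int)sizeof(struct record);

	if(fseek(fp, offset, SEEK_SET) != 0)
		return -1;

	if(fread(rp, sizeof(struct record), 1, fp) != 1)
		return -1;

	return 0;
}
```

Here the fseek target never came from ftell; it was computed from the record number and the record size, which is exactly what you can't portably do with a text stream.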


16.10: File Operations (remove, rename, etc.)

You can delete a file by calling

	int remove(const char *filename)

You can rename a file by calling

	int rename(const char *oldname, const char *newname)

Both of these functions return zero if they succeed and a nonzero value if they fail.
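
As an example of using the two together, here is a sketch of a function which writes some text to a temporary file and then renames the temporary file to its final name, removing the temporary file if anything goes wrong. The function save_text and its three parameters are hypothetical. Note also that what rename does when a file named newname already exists depends on the system, so this sketch is most trustworthy when the final name is not yet in use.

```c
#include <stdio.h>

/* save_text is a hypothetical helper.  It writes text to the file
 * tmpname, then renames that file to finalname.
 * Returns 0 on success, -1 on failure. */
int save_text(const char *finalname, const char *tmpname, const char *text)
{
	int failed;
	FILE *fp = fopen(tmpname, "w");

	if(fp == NULL)
		return -1;

	failed = (fputs(text, fp) == EOF);

	/* fclose can fail too, if buffered output can't be written */
	if(fclose(fp) != 0)
		failed = 1;

	/* both rename and remove return zero on success */
	if(!failed && rename(tmpname, finalname) != 0)
		failed = 1;

	if(failed)
		{
		remove(tmpname);	/* don't leave the temporary file around */
		return -1;
		}

	return 0;
}
```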

There are no standard C functions for dealing with directories (e.g. listing or creating them). On many systems, you will find functions mkdir for making directories and rmdir for removing them, and a suite of functions opendir, readdir, and closedir for listing them. Since these functions aren't standard, however, we won't talk about them here. (They exist on most Unix systems, but they're not standard under MS-DOS or Macintosh compilers, although you can find implementations on the net.)


16.11: Redirection (freopen)

For some programs, standard input and standard output are enough, and these programs can get by using just getchar, putchar, printf, etc., and letting any input/output redirection be handled by the user and the operating system (perhaps using command-line redirection such as < and >). Other programs handle all file manipulation themselves, opening files with fopen and maintaining file pointer (FILE *) variables recording the streams to which all input and output is done (with getc, putc, fprintf, etc.).

Occasionally, a program has to be rewritten in a hurry, to allow it to read or write a named file without manipulating file pointers and changing every call to getchar to getc, every call to printf to fprintf, etc. In these cases, the function freopen comes in handy: it reopens an existing stream on a new file. The prototype is

	FILE *freopen(const char *filename, const char *mode, FILE *fp)

freopen works much like fopen, except that rather than allocating a new stream, it uses (and returns) the caller-supplied stream fp; like fopen, it returns a null pointer if the file can't be opened. For example, to redirect a program's output to a file "from within," you could call

	freopen(filename, "w", stdout);

A disadvantage of freopen is that there's generally no way to undo it; you can't change your mind later and make stdin or stdout go back to where they had been before you called freopen. In situations where you want to be able to switch back and forth between streams, it's much better if you can chase down and change every call to getchar to getc, every call to printf to fprintf, etc., and then use some FILE * variable under your control (typically with a name like ifp or ofp) so that you can set it to point to a file by calling fopen, and later back to stdin or stdout by simply reassigning it.
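
The FILE * variable approach might be sketched like this. The helpers open_output and close_output are hypothetical, as is the convention that a null filename means "use standard output".

```c
#include <stdio.h>

/* open_output is a hypothetical helper.  It returns the stream that
 * output should go to: stdout if filename is a null pointer,
 * otherwise the named file opened for writing (or a null pointer
 * if the file can't be opened). */
FILE *open_output(const char *filename)
{
	if(filename == NULL)
		return stdout;

	return fopen(filename, "w");
}

/* close_output is a hypothetical helper.  It closes ofp unless it
 * is stdout, which we want to remain usable. */
void close_output(FILE *ofp)
{
	if(ofp != NULL && ofp != stdout)
		fclose(ofp);
}
```

The rest of the program would then call fprintf(ofp, ...) everywhere, where ofp was set by ofp = open_output(filename). To switch back to standard output later, it could call close_output(ofp) and then simply assign ofp = stdout.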

