File Management

Opening a File

The process of creating a stream linked to a disk file is called opening the file. When you open a file, it becomes available for reading (meaning that data is input from the file to the program), writing (meaning that data from the program is saved in the file), or both. When you're done using the file, you must close it. Closing a file is covered later in this chapter.

To open a file, you use the fopen() library function. The prototype of fopen() is located in STDIO.H and reads as follows:

FILE *fopen(const char *filename, const char *mode);

This prototype tells you that fopen() returns a pointer to type FILE, which is a structure declared in STDIO.H. The members of the FILE structure are used by the program in the various file access operations, but you don't need to be concerned about them. However, for each file that you want to open, you must declare a pointer to type FILE. When you call fopen(), that function creates an instance of the FILE structure and returns a pointer to that structure. You use this pointer in all subsequent operations on the file. If fopen() fails, it returns NULL. Such a failure could be caused, for example, by a hardware error or by trying to open a file on a diskette that hasn't been formatted.

The argument filename is the name of the file to be opened. As noted earlier, filename can--and should--contain a path specification. The filename argument can be a literal string enclosed in double quotation marks or a pointer to a string variable.

The argument mode specifies the mode in which to open the file. In this context, mode controls whether the file is binary or text and whether it is for reading, writing, or both. The permitted values for mode are listed in the following table.

Table: Values of mode for the fopen() function

  • Mode r  : Opens the file for reading. If the file doesn't exist, fopen() returns NULL.
  • Mode w  : Opens the file for writing. If a file of the specified name doesn't exist, it is created. If a file of the specified name does exist, it is deleted without warning, and a new, empty file is created.
  • Mode a  : Opens the file for appending. If a file of the specified name doesn't exist, it is created. If the file does exist, new data is appended to the end of the file.
  • Mode r+ : Opens the file for reading and writing. If a file of the specified name doesn't exist, fopen() returns NULL. If the file does exist, its contents are kept, and reading or writing starts at the beginning of the file, so new data overwrites existing data.
  • Mode w+ : Opens the file for reading and writing. If a file of the specified name doesn't exist, it is created. If the file does exist, it is overwritten.
  • Mode a+ : Opens a file for reading and appending. If a file of the specified name doesn't exist, it is created. If the file does exist, new data is appended to the end of the file.

The default file mode is text. To open a file in binary mode, you append a b to the mode argument. Thus, a mode argument of a would open a text-mode file for appending, whereas ab would open a binary-mode file for appending.
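As a minimal sketch of how the w, a, and r modes behave, the following function writes a file, appends to it, and reads the result back. The name demo_modes() and its parameters are hypothetical, chosen only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: writes with "w", appends with "a", then reads the
   result back with "r". Returns 0 on success, -1 if any step fails.
   On success, buf holds the first line of the file. */
int demo_modes(const char *path, char *buf, int size)
{
    FILE *fp;

    /* "w" creates the file, or empties it if it already exists. */
    if ((fp = fopen(path, "w")) == NULL)
        return -1;
    fputs("first", fp);
    fclose(fp);

    /* "a" keeps the existing contents and writes at the end. */
    if ((fp = fopen(path, "a")) == NULL)
        return -1;
    fputs(" second\n", fp);
    fclose(fp);

    /* "r" requires the file to exist. */
    if ((fp = fopen(path, "r")) == NULL)
        return -1;
    if (fgets(buf, size, fp) == NULL)
    {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return 0;
}
```

Calling demo_modes() with a filename and a buffer of 64 bytes should leave the single line "first second" in the buffer, because the second fopen() appended to the file instead of replacing it.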

Remember that fopen() returns NULL if an error occurs. Error conditions that can cause a return value of NULL include the following:

  • Using an invalid filename.
  • Trying to open a file on a disk that isn't ready (the drive door isn't closed or the disk isn't formatted, for example).
  • Trying to open a file in a nonexistent directory or on a nonexistent disk drive.
  • Trying to open a nonexistent file in mode r.

Whenever you use fopen(), you need to test for the occurrence of an error. There's no way to tell exactly which error occurred, but you can display a message to the user and try to open the file again, or you can end the program. Most C compilers include non-ANSI extensions that let you obtain information about the nature of the error; refer to your compiler documentation for information.
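One common way to report more detail is sketched below. It assumes that the system stores an error code in errno when fopen() fails; POSIX systems do this, but the C standard doesn't require it, so the code falls back to a generic message. The wrapper name open_or_report() is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: opens a file and, on failure, reports the reason
   the system recorded in errno (if it recorded one). */
FILE *open_or_report(const char *path, const char *mode)
{
    FILE *fp;

    errno = 0;                      /* Clear any stale error code. */
    fp = fopen(path, mode);
    if (fp == NULL)
    {
        if (errno != 0)
            fprintf(stderr, "Cannot open %s: %s\n", path, strerror(errno));
        else
            fprintf(stderr, "Cannot open %s.\n", path);
    }
    return fp;
}
```

The caller still has to test the returned pointer for NULL; the wrapper only adds the message.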

Example: Using fopen() to open disk files in various modes.

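/* Example program for the fopen() function. */
#include <stdio.h>

/* Reads one line from stdin into buf, discarding the newline and any */
/* characters that don't fit. Returns 0 at end of input, 1 otherwise. */
int read_line(char *buf, size_t size)
{
	size_t i = 0;
	int ch;

	while ( (ch = getchar()) != EOF && ch != '\n' )
		if (i + 1 < size)
			buf[i++] = (char) ch;
	buf[i] = '\0';
	return (ch != EOF || i > 0);
}

int main(void)
{
	FILE *fp;
	char filename[40], mode[4], reply[4];

	while (1)
	{
		/* Input filename and mode. */
		printf("\nEnter a filename: ");
		if ( !read_line(filename, sizeof(filename)) )
			break;
		printf("\nEnter a mode (max 3 characters): ");
		if ( !read_line(mode, sizeof(mode)) )
			break;

		/* Try to open the file. */
		if ( (fp = fopen( filename, mode )) != NULL )
		{
			printf("\nSuccessful opening %s in mode %s.\n",
				filename, mode);
			fclose(fp);
			puts("Enter x to exit, any other to continue.");
		}
		else
		{
			fprintf(stderr, "\nError opening file %s in mode %s.\n",
				filename, mode);
			puts("Enter x to exit, any other to try again.");
		}

		/* Exit on x or at end of input; otherwise go around again. */
		if ( !read_line(reply, sizeof(reply)) || reply[0] == 'x' )
			break;
	}
	return 0;
}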
Enter a filename:junk.txt
Enter a mode (max 3 characters):w
Successful opening junk.txt in mode w.
Enter x to exit, any other to continue.j
Enter a filename:morejunk.txt
Enter a mode (max 3 characters):r
Error opening file morejunk.txt in mode r.

Enter x to exit, any other to try again.
x

File Opening Program Analysis:

This program prompts you for both the filename and the mode. After getting them, it attempts to open the file and assign its file pointer to fp. As an example of good programming practice, the if statement that calls fopen() checks that the returned pointer isn't equal to NULL. If fp isn't equal to NULL, the program prints a message stating that the open was successful and that the user can continue. If the file pointer is NULL, the else branch of the if statement executes and prints a message stating that there was a problem. The program then prompts the user to determine whether it should continue.

You can experiment with different names and modes to see which ones give you an error. In the output just shown, you can see that trying to open MOREJUNK.TXT in mode r resulted in an error because the file didn't exist on the disk. If an error occurs, you're given the choice of entering the information again or quitting the program. To force an error, you could enter an invalid filename such as [].

Writing and Reading File Data

A program that uses a disk file can write data to a file, read data from a file, or a combination of the two. You can write data to a disk file in three ways:

  • You can use formatted output to save formatted data to a file. You should use formatted output only with text-mode files. The primary use of formatted output is to create files containing text and numeric data to be read by other programs such as spreadsheets or databases. You rarely, if ever, use formatted output to create a file to be read again by a C program.
  • You can use character output to save single characters or lines of characters to a file. Although it's technically possible to use character output with binary-mode files, it can be tricky. You should restrict character-mode output to text files. The main use of character output is to save text (but not numeric) data in a form that can be read by C, as well as by other programs such as word processors.
  • You can use direct output to save the contents of a section of memory directly to a disk file. This method is for binary files only. Direct output is the best way to save data for later use by a C program.

When you want to read data from a file, you have the same three options: formatted input, character input, or direct input. The type of input you use in a particular case depends almost entirely on the nature of the file being read. Generally, you will read data in the same mode that it was saved in, but this is not a requirement. However, reading a file in a mode different from the one it was written in requires a thorough knowledge of C and file formats.

The previous descriptions of the three types of file input and output suggest tasks best suited for each type of output. This is by no means a set of strict rules. The C language is very flexible (this is one of its advantages!), so a clever programmer can make any type of file output suit almost any need. As a beginning programmer, it might make things easier if you follow these guidelines, at least initially.

Formatted File Input and Output

Formatted file input/output deals with text and numeric data that is formatted in a specific way. It is directly analogous to formatted keyboard input and screen output done with the printf() and scanf() functions, as described on Day 14. I'll discuss formatted output first, followed by the input.

Formatted File Output

Formatted file output is done with the library function fprintf(). The prototype of fprintf() is in the header file STDIO.H, and it reads as follows:

int fprintf(FILE *fp, const char *fmt, ...);

The first argument is a pointer to type FILE. To write data to a particular disk file, you pass the pointer that was returned when you opened the file with fopen().

The second argument is the format string. You learned about format strings in the discussion of printf() on Day 14. The format string used by fprintf() follows exactly the same rules as printf(). Refer to Day 14 for details.

The final argument is ... What does that mean? In a function prototype, ellipses represent a variable number of arguments. In other words, in addition to the file pointer and the format string arguments, fprintf() takes zero, one, or more additional arguments. This is just like printf(). These arguments are the names of the variables to be output to the specified stream.

Remember, fprintf() works just like printf(), except that it sends its output to the stream specified in the argument list. In fact, if you specify a stream argument of stdout, fprintf() is identical to printf().

Example: The equivalence of fprintf() formatted output to both a file and to stdout

/* Demonstrates the fprintf() function. */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void clear_kb(void);

int main(void)
{
	FILE *fp;
	float data[5];
	int count;
	char filename[20];

	puts("Enter 5 floating-point numerical values.");

	for (count = 0; count < 5; count++)
		if (scanf("%f", &data[count]) != 1)
		{
			fprintf(stderr, "Invalid number.\n");
			exit(1);
		}

	/* Get the filename and open the file. First clear stdin */
	/* of any extra characters. */

	clear_kb();

	puts("Enter a name for the file.");
	if (fgets(filename, sizeof(filename), stdin) == NULL)
	{
		fprintf(stderr, "Error reading the filename.\n");
		exit(1);
	}
	filename[strcspn(filename, "\n")] = '\0';	/* Remove the newline. */

	if ( (fp = fopen(filename, "w")) == NULL)
	{
		fprintf(stderr, "Error opening file %s.\n", filename);
		exit(1);
	}

	/* Write the numerical data to the file and to stdout. */

	for (count = 0; count < 5; count++)
	{
		fprintf(fp, "\ndata[%d] = %f", count, data[count]);
		fprintf(stdout, "\ndata[%d] = %f", count, data[count]);
	}
	fclose(fp);
	printf("\n");
	return 0;
}

void clear_kb(void)
/* Clears stdin of any waiting characters. */
{
	int ch;

	while ( (ch = getchar()) != '\n' && ch != EOF )
		;
}
Enter 5 floating-point numerical values.
3.14159
9.99
1.50
3.
1000.0001

Enter a name for the file.
numbers.txt
data[0] = 3.141590
data[1] = 9.990000
data[2] = 1.500000
data[3] = 3.000000
data[4] = 1000.000122

Program Analysis:

You might wonder why the program displays 1000.000122 when the value you entered was 1000.0001. This isn't an error in the program. It's a normal consequence of the way C stores numbers internally. Some floating-point values can't be stored exactly, so minor inaccuracies such as this one sometimes result.

This program calls fprintf() twice inside its final for loop to send some formatted text and numeric data to stdout and to the disk file whose name you specified. The only difference between the two calls is the first argument--that is, the stream to which the data is sent. After running the program, use your editor to look at the contents of the file NUMBERS.TXT (or whatever name you assigned to it), which is created in the program's current directory unless you included a path in the name. You'll see that the text in the file is an exact copy of the text that was displayed on-screen.

Note that this example uses the clear_kb() function discussed on Day 14. This is necessary to remove from stdin any extra characters that might be left over from the call to scanf(). If you don't clear stdin, these extra characters (specifically, the newline) are read by the call that inputs the filename, and the result is a file creation error.

Formatted File Input

For formatted file input, use the fscanf() library function, which is used like scanf() (see Day 14), except that input comes from a specified stream instead of from stdin. The prototype for fscanf() is

int fscanf(FILE *fp, const char *fmt, ...);

The argument fp is the pointer to type FILE returned by fopen(), and fmt is a pointer to the format string that specifies how fscanf() is to read the input. The components of the format string are the same as for scanf(). Finally, the ellipses (...) indicate one or more additional arguments, the addresses of the variables where fscanf() is to assign the input.

Before getting started with fscanf(), you might want to review the section on scanf() on Day 14. The function fscanf() works exactly the same as scanf(), except that characters are taken from the specified stream rather than from stdin.

To demonstrate fscanf(), you need a text file containing some numbers or strings in a format that can be read by the function. Use your editor to create a file named INPUT.TXT, and enter five floating-point numbers with some space between them (spaces or newlines). For example, your file might look like this:

123.45     87.001
100.02
0.00456    1.0005

Example: Using fscanf() to read formatted data from file

/* Reading formatted file data with fscanf(). */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	float f1, f2, f3, f4, f5;
	FILE *fp;

	if ( (fp = fopen("INPUT.TXT", "r")) == NULL)
	{
		fprintf(stderr, "Error opening file.\n");
		exit(1);
	}

	/* fscanf() returns the number of values it assigned. */
	if (fscanf(fp, "%f %f %f %f %f", &f1, &f2, &f3, &f4, &f5) != 5)
	{
		fprintf(stderr, "Error reading file.\n");
		fclose(fp);
		exit(1);
	}
	printf("The values are %f, %f, %f, %f, and %f.\n",
		f1, f2, f3, f4, f5);

	fclose(fp);
	return 0;
}

The values are 123.45, 87.001, 100.02, 0.00456, and 1.0005.

Program Analysis:

This program reads the five values from the file you created and then displays them on-screen. The fopen() call opens the file in read mode, and the if statement around it checks that the file opened correctly. If the file wasn't opened, an error message is displayed and the program exits. The fscanf() call then reads the values. With the exception of the first parameter, fscanf() is identical to scanf(), which you have been using throughout this book. The first parameter points to the file that you want the program to read. You can do further experiments with fscanf(), creating input files with your programming editor and seeing how fscanf() reads the data.
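When you don't know in advance how many values a file holds, the return value of fscanf() can drive a loop. The following is a minimal sketch under that assumption; the function name read_floats() and its parameters are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to max values from path into values[].
   Returns the number of values read, or -1 if the file can't be opened. */
int read_floats(const char *path, float values[], int max)
{
    FILE *fp;
    int n = 0;

    if ((fp = fopen(path, "r")) == NULL)
        return -1;

    /* fscanf() returns the number of items it assigned; anything other
       than 1 here means end-of-file or text that isn't a number. */
    while (n < max && fscanf(fp, "%f", &values[n]) == 1)
        n++;

    fclose(fp);
    return n;
}
```

Because the %f specifier skips leading white space, the values can be separated by spaces or newlines in any combination.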

Character Input and Output

When used with disk files, the term character I/O refers to single characters as well as lines of characters. Remember, a line is a sequence of zero or more characters terminated by the newline character. Use character I/O with text-mode files. The following sections describe character input/output functions, and then you'll see a demonstration program.

Character Input

There are three character input functions: getc() and fgetc() for single characters, and fgets() for lines.

The getc() and fgetc() Functions

The functions getc() and fgetc() are identical and can be used interchangeably. They input a single character from the specified stream. Here is the prototype of getc(), which is in STDIO.H:

int getc(FILE *fp);

The argument fp is the pointer returned by fopen() when the file is opened. The function returns the character that was input, or EOF at end-of-file or on error.

You've seen getc() used in earlier programs to input a character from the keyboard. This is another example of the flexibility of C's streams--the same function can be used for keyboard or file input.

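If getc() and fgetc() return a single character, why are they prototyped to return a type int? The reason is that these functions must be able to return every possible character value and, in addition, the end-of-file indicator EOF, which is a negative int value (typically -1) that differs from every character value. A type char can't hold both, so the return type is int, and you should store the result in an int variable before comparing it with EOF. You'll see getc() in action later, in Listing 16.10.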
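A typical read loop built on getc() might look like the following sketch. The function name count_lines() is hypothetical; the point is that the result of getc() is stored in an int and compared with EOF.

```c
#include <stdio.h>

/* Hypothetical helper: counts the newline characters in a text file.
   Returns the count, or -1 if the file can't be opened. */
long count_lines(const char *path)
{
    FILE *fp;
    int ch;             /* int, not char, so EOF can be told apart */
    long lines = 0;

    if ((fp = fopen(path, "r")) == NULL)
        return -1;

    while ((ch = getc(fp)) != EOF)
        if (ch == '\n')
            lines++;

    fclose(fp);
    return lines;
}
```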

The fgets() Function

To read a line of characters from a file, use the fgets() library function. The prototype is

char *fgets(char *str, int n, FILE *fp);

The argument str is a pointer to a buffer in which the input is to be stored, n is the maximum number of characters to be input, and fp is the pointer to type FILE that was returned by fopen() when the file was opened.

When called, fgets() reads characters from fp into memory, starting at the location pointed to by str. Characters are read until a newline is encountered or until n-1 characters have been read, whichever occurs first. By setting n equal to the number of bytes allocated for the buffer str, you prevent input from overwriting memory beyond allocated space. (The n-1 is to allow space for the terminating \0 that fgets() adds to the end of the string.) If successful, fgets() returns str. A return value of NULL indicates one of two conditions:

  • If end-of-file is encountered before any characters have been assigned to str, NULL is returned, and the memory pointed to by str is unchanged.
  • If a read error occurs, NULL is returned, and the memory pointed to by str contains garbage.

If end-of-file is encountered after one or more characters have been assigned to str, fgets() returns str as usual; the string simply doesn't end with a newline.

You can see that fgets() doesn't necessarily input an entire line (that is, everything up to the next newline character). If n-1 characters are read before a newline is encountered, fgets() stops. The next read operation from the file starts where the last one leaves off. To be sure that fgets() reads in entire strings, stopping only at newlines, be sure that the size of your input buffer and the corresponding value of n passed to fgets() are large enough.
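The following sketch shows one way to cope with lines that are longer than the buffer: a piece that doesn't end in a newline is treated as an unfinished line, and the next call continues it. The function name longest_line() and the deliberately small 16-byte buffer are assumptions made for the illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the length of the longest line in a text
   file (not counting the newline), or -1 if the file can't be opened. */
long longest_line(const char *path)
{
    FILE *fp;
    char buf[16];       /* deliberately small, to force partial reads */
    long current = 0, longest = 0;

    if ((fp = fopen(path, "r")) == NULL)
        return -1;

    while (fgets(buf, sizeof(buf), fp) != NULL)
    {
        size_t len = strlen(buf);

        if (len > 0 && buf[len - 1] == '\n')
        {
            /* A newline means this piece completes the line. */
            current += (long) (len - 1);
            if (current > longest)
                longest = current;
            current = 0;
        }
        else
            current += (long) len;  /* Partial line; more follows. */
    }
    if (current > longest)          /* Last line without a newline. */
        longest = current;

    fclose(fp);
    return longest;
}
```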

Character Output

You need to know about two character output functions: putc() and fputs().

The putc() Function

The library function putc() writes a single character to a specified stream. Its prototype in STDIO.H reads

int putc(int ch, FILE *fp);

The argument ch is the character to output. As with other character functions, it is formally called a type int, but only the lower-order byte is used. The argument fp is the pointer associated with the file (the pointer returned by fopen() when the file was opened). The function putc() returns the character just written if successful or EOF if an error occurs. The symbolic constant EOF is defined in STDIO.H as a negative int value, typically -1. Because putc() returns the character it wrote as a nonnegative value, no "real" character can be confused with EOF, so EOF can be used as an error indicator.

The fputs() Function

To write a line of characters to a stream, use the library function fputs(). This function works much like puts(), covered on Day 14, with two differences: with fputs() you specify the output stream, and fputs() doesn't add a newline to the end of the string; if you want one, you must explicitly include it. Its prototype in STDIO.H is

int fputs(const char *str, FILE *fp);

The argument str is a pointer to the null-terminated string to be written, and fp is the pointer to type FILE returned by fopen() when the file was opened. The string pointed to by str is written to the file, minus its terminating \0. The function fputs() returns a nonnegative value if successful or EOF on error.
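As a brief sketch of the two output functions working together, the function below writes a label with fputs() and then ten digits one at a time with putc(), checking each return value against EOF. The name write_digits() and its parameters are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes label with fputs(), then the digits 0-9
   one character at a time with putc(). Returns 0 on success, -1 on error. */
int write_digits(const char *path, const char *label)
{
    FILE *fp;
    int ch;

    if ((fp = fopen(path, "w")) == NULL)
        return -1;

    if (fputs(label, fp) == EOF)        /* No newline is added. */
    {
        fclose(fp);
        return -1;
    }
    for (ch = '0'; ch <= '9'; ch++)
        if (putc(ch, fp) == EOF)
        {
            fclose(fp);
            return -1;
        }
    putc('\n', fp);

    /* fclose() flushes buffered output, so its result matters too. */
    return (fclose(fp) == 0) ? 0 : -1;
}
```

Calling write_digits() with the label "digits: " should produce a file containing the single line "digits: 0123456789".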

Direct File Input and Output

You use direct file I/O most often when you save data to be read later by the same or a different C program. Direct I/O is used only with binary-mode files. With direct output, blocks of data are written from memory to disk. Direct input reverses the process: A block of data is read from a disk file into memory. For example, a single direct-output function call can write an entire array of type double to disk, and a single direct-input function call can read the entire array from disk back into memory. The direct I/O functions are fread() and fwrite().

The fwrite() Function

The fwrite() library function writes a block of data from memory to a binary-mode file. Its prototype in STDIO.H is

size_t fwrite(const void *buf, size_t size, size_t count, FILE *fp);

The argument buf is a pointer to the region of memory holding the data to be written to the file. The pointer type is void; it can be a pointer to anything.

The argument size specifies the size, in bytes, of the individual data items, and count specifies the number of items to be written. Both have type size_t, an unsigned integer type. For example, if you wanted to save a 100-element integer array, size would be sizeof(int) (4 bytes on most current systems, 2 bytes on older 16-bit systems) and count would be 100 (because the array contains 100 elements). Because the size of a type varies from system to system, use the sizeof operator to obtain the size argument instead of writing the number yourself.

The argument fp is, of course, the pointer to type FILE, returned by fopen() when the file was opened. The fwrite() function returns the number of items written on success; if the value returned is less than count, it means that an error has occurred. To check for errors, you usually program fwrite() as follows:

if ( fwrite(buf, size, count, fp) != count )
	fprintf(stderr, "Error writing to file.");

Here are some examples of using fwrite(). To write a single type double variable x to a file, use the following:

fwrite(&x, sizeof(double), 1, fp);

To write an array data[] of 50 structures of type address to a file, you have two choices:

fwrite(data, sizeof(address), 50, fp);
fwrite(data, sizeof(data), 1, fp);

The first method writes the array as 50 elements, with each element having the size of a single type address structure. The second method treats the array as a single element. The two methods accomplish exactly the same thing.
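The first method might be wrapped up as in the following sketch. The struct address shown here is a hypothetical stand-in, because the members of the real structure aren't given, and the function name save_addresses() is likewise an assumption.

```c
#include <stdio.h>

/* Hypothetical record type, standing in for the address structure. */
struct address
{
    char name[20];
    int  zip;
};

/* Writes an array of n records to a binary file as n separate items.
   Returns 0 on success, -1 on error. */
int save_addresses(const char *path, const struct address *data, size_t n)
{
    FILE *fp;
    size_t written;

    if ((fp = fopen(path, "wb")) == NULL)
        return -1;

    /* fwrite() returns the number of complete items written. */
    written = fwrite(data, sizeof(struct address), n, fp);

    if (fclose(fp) != 0 || written != n)
        return -1;
    return 0;
}
```

Writing the array as one item of sizeof(data) bytes would produce the same file, but fwrite() would then return 1 on success instead of the element count. That form also works only where data is a true array; inside a function such as this one, data is a pointer, and sizeof(data) would give the size of the pointer.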

The following section explains fread() and then presents a program demonstrating fread() and fwrite().

The fread() Function

The fread() library function reads a block of data from a binary-mode file into memory. Its prototype in STDIO.H is

size_t fread(void *buf, size_t size, size_t count, FILE *fp);

The argument buf is a pointer to the region of memory that receives the data read from the file. As with fwrite(), the pointer type is void.

The argument size specifies the size, in bytes, of the individual data items being read, and count specifies the number of items to read. Note how these arguments parallel the arguments used by fwrite(). Again, the sizeof operator is typically used to provide the size argument. The argument fp is (as always) the pointer to type FILE that was returned by fopen() when the file was opened. The fread() function returns the number of items read; this can be less than count if end-of-file was reached or an error occurred.
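Because a short count from fread() can mean either end-of-file or an error, a caller that needs to know which one occurred can ask ferror(). The sketch below assumes a file of int values; the function name load_ints() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to max ints from a binary file.
   Returns the number of ints read, or -1 if the file can't be opened
   or a read error occurs. */
long load_ints(const char *path, int values[], size_t max)
{
    FILE *fp;
    size_t got;
    int failed;

    if ((fp = fopen(path, "rb")) == NULL)
        return -1;

    got = fread(values, sizeof(int), max, fp);

    /* A short count is normal at end-of-file; ferror() tells the two
       cases apart. */
    failed = (got < max && ferror(fp));

    fclose(fp);
    return failed ? -1 : (long) got;
}
```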

Example: Using fwrite() and fread() for direct file access

/* Direct file I/O with fwrite() and fread(). */
#include <stdio.h>
#include <stdlib.h>

#define SIZE 20

int main(void)
{
	int count, array1[SIZE], array2[SIZE];
	FILE *fp;

	/* Initialize array1[]. */

	for (count = 0; count < SIZE; count++)
		array1[count] = 2 * count;

	/* Open a binary mode file. */

	if ( (fp = fopen("direct.txt", "wb")) == NULL)
	{
		fprintf(stderr, "Error opening file.\n");
		exit(1);
	}

	/* Save array1[] to the file. */

	if (fwrite(array1, sizeof(int), SIZE, fp) != SIZE)
	{
		fprintf(stderr, "Error writing to file.\n");
		exit(1);
	}

	fclose(fp);

	/* Now open the same file for reading in binary mode. */

	if ( (fp = fopen("direct.txt", "rb")) == NULL)
	{
		fprintf(stderr, "Error opening file.\n");
		exit(1);
	}

	/* Read the data into array2[]. */

	if (fread(array2, sizeof(int), SIZE, fp) != SIZE)
	{
		fprintf(stderr, "Error reading file.\n");
		exit(1);
	}

	fclose(fp);

	/* Now display both arrays to show they're the same. */

	for (count = 0; count < SIZE; count++)
		printf("%d\t%d\n", array1[count], array2[count]);
	return 0;
}
0       0
2       2
4       4
6       6
8       8
10      10
12      12
14      14
16      16
18      18
20      20
22      22
24      24
26      26
28      28
30      30
32      32
34      34
36      36
38      38

Program Analysis:

This program demonstrates the use of the fwrite() and fread() functions. It first initializes an array and then uses fwrite() to save the array to disk. After reopening the file, the program uses fread() to read the data into a different array. Finally, it displays both arrays on-screen to show that they now hold the same data.

When you save data with fwrite(), not much can go wrong besides some type of disk error. With fread(), you need to be careful, however. As far as fread() is concerned, the data on the disk is just a sequence of bytes. The function has no way of knowing what it represents. For example, on a 16-bit system, a block of 100 bytes could be 100 char variables, 50 int variables, 25 long variables, or 25 float variables. If you ask fread() to read that block into memory, it obediently does so. However, if the block was saved from an array of type int and you retrieve it into an array of type float, no error occurs, but you get strange results. When writing programs, you must be sure that fread() is used properly, reading data into the appropriate types of variables and arrays. Notice that in this example, all calls to fopen(), fwrite(), and fread() are checked to ensure that they worked correctly.