< PREV | NEXT > | INDEX | SITEMAP | GOOGLE | LINKS | UPDATES | BLOG | EMAIL | $Donate? | HOME

[3.0] C Input & Output

v3.1.1 / chapter 3 of 5 / 01 sep 14 / greg goebel / public domain

* This chapter covers console (keyboard/display) and file I/O. You've already seen one console-I/O function, "printf()", and there are several others. C has two separate approaches toward file I/O, one based on library functions that is similar to console I/O, and a second that uses "system calls". These topics are discussed in detail below.


[3.1] C CONSOLE I/O
[3.2] C FILE-I/O THROUGH LIBRARY FUNCTIONS
[3.3] C FILE-I/O THROUGH SYSTEM CALLS

[3.1] C CONSOLE I/O

* Console I/O in general means communications with the computer's keyboard and display. However, in most modern operating systems the keyboard and display are simply the default input and output devices, and the user can easily redirect input from, say, a file or other program and redirect output to, say, a serial I/O port:

   type infile | myprog > com

The program itself, "myprog", doesn't know the difference. The program uses console I/O to simply read its "standard input (stdin)" -- which might be the keyboard, a file dump, or the output of some other program -- and print to its "standard output (stdout)" -- which might be the display or printer or another program or a file. The program itself neither knows nor cares.

Console I/O requires the declaration:

   #include <stdio.h>

Useful functions include:

   printf()      Print a formatted string to stdout.
   scanf()       Read formatted data from stdin.
   putchar()     Print a single character to stdout.
   getchar()     Read a single character from stdin.
   puts()        Print a string to stdout.
   gets()        Read a line from stdin.

PC-based compilers also have an alternative library of console I/O functions. These functions require the declaration:

   #include <conio.h>

The three most useful PC console I/O functions are:

   getch()    Get a character from the keyboard (no need to press Enter).
   getche()   Get a character from the keyboard and echo it.
   kbhit()    Check to see if a key has been pressed. 

* The "printf()" function, as explained previously, prints a string that may include formatted data:

   printf( "This is a test!\n" );

-- which can include the contents of variables:

   printf( "Value1:  %d   Value2:  %f\n", intval, floatval );

The available format codes are:

   %d    decimal integer
   %ld   long decimal integer
   %c    character
   %s    string
   %e    floating-point number in exponential notation
   %f    floating-point number in decimal notation
   %g    use %e or %f, whichever is shorter
   %u    unsigned decimal integer
   %o    unsigned octal integer
   %x    unsigned hex integer

Using the wrong format code for a particular data type can lead to bizarre output. Further control can be obtained with modifier codes; for example, a numeric prefix can be included to specify the minimum field width:

   %10d

This specifies a minimum field width of ten characters. If the field width is too small, a wider field will be used. Adding a minus sign:

   %-10d

-- causes the text to be left-justified. A numeric precision can also be specified:

   %6.3f

This specifies three digits of precision in a field six characters wide. A string precision can be specified as well, to indicate the maximum number of characters to be printed. For example:

   /* prtint.c */

   #include <stdio.h>

   int main( void )
   {
     printf( "<%d>\n", 336 );
     printf( "<%2d>\n", 336 );
     printf( "<%10d>\n", 336 );
     printf( "<%-10d>\n", 336 );
   }

This prints:

   <336>
   <336>
   <       336>
   <336       >

Similarly:

   /* prfloat.c */

   #include <stdio.h>

   int main( void )
   {
     printf( "<%f>\n", 1234.56 );
     printf( "<%e>\n", 1234.56 );
     printf( "<%4.2f>\n", 1234.56 );
     printf( "<%3.1f>\n", 1234.56 );
     printf( "<%10.3f>\n", 1234.56 );
     printf( "<%10.3e>\n", 1234.56 );
   }

-- prints:

   <1234.560000>
   <1.234560e+03>
   <1234.56>
   <1234.6>
   <  1234.560>
   < 1.235e+03>

And finally:

   /* prtstr.c */

   #include <stdio.h>

   int main( void )
   {
     printf( "<%2s>\n", "Barney must die!" );
     printf( "<%22s>\n", "Barney must die!" );
     printf( "<%22.5s>\n", "Barney must die!" );
     printf( "<%-22.5s>\n", "Barney must die!" );
   }

-- prints:

   <Barney must die!>
   <      Barney must die!>
   <                 Barne>
   <Barne                 >

Just for convenience, the table of special characters listed in chapter 2 is repeated here. These characters can be embedded in "printf" strings:

   '\a'     alarm (beep) character
   '\b'     backspace
   '\f'     formfeed
   '\n'     newline
   '\r'     carriage return
   '\t'     horizontal tab
   '\v'     vertical tab
   '\\'     backslash
   '\?'     question mark
   '\''     single quote
   '\"'     double quote
   '\NNN'   character code in octal
   '\xNN'   character code in hex
   '\0'     null character

The "scanf()" function reads formatted data using a syntax similar to that of "printf", except that it requires pointers as parameters, since it has to return values. For example:

   /* cscanf.c */

   #include <stdio.h>

   int main( void )
   {
     int val;
     char name[256];
   
     printf( "Enter your age and name.\n" );
     scanf( "%d %s", &val, name ); 
     printf( "Your name is: %s -- and your age is: %d\n", name, val );
   }

There is no "&" in front of "name", since the name of a string is already a pointer. Input fields are separated by whitespace (space, tab, or newline), though a count, for example "%10d", can be included to define a specific field width. Formatting codes are the same as for "printf()", except:

If characters are included in the format code, "scanf()" will read in the characters and discard them. For example, if the example above were modified as follows:

   scanf( "%d,%s", &val, name );

-- then "scanf()" will assume that the two input values are comma-separated and swallow the comma when it is encountered.

If a format code is preceded with an asterisk, the data will be read and discarded. For example, if the example were changed to:

   scanf( "%d%*c%s", &val, name );

-- then if the two fields were separated by a ":", that character would be read in and discarded.

The "scanf()" function will return the value EOF (an "int"), defined in "stdio.h", when its input is terminated.

* The "putchar()" and "getchar()" functions handle single character I/O. For example, the following program accepts characters from standard input one at a time:

   /* inout.c */

   #include <stdio.h>

   int main( void )
   {
     int ch;
   
     while ((ch = getchar()) != EOF)
     {
       putchar( ch ); 
     }
   }

The "getchar()" function returns an "int", not a "char", so that it can return EOF when its input ends. Notice the neat way C allows a program to get a value and then test it in the same expression, a particularly useful feature for handling loops.

One word of warning on single-character I/O: if a program is reading characters from the keyboard, most operating systems won't send the characters to the program until the user presses the "Enter" key, meaning it's not possible to perform single-character keyboard I/O this way.

The little program above is the essential core of a character-mode text "filter", a program that can perform some transformation between standard input and standard output. Such a filter can be used as an element to construct more sophisticated applications:

   type file.txt | filter1 | filter2 > outfile.txt

The following filter capitalizes the first character in each word in the input. The program operates as a "state machine", using a variable that can be set to different values, or "states", to control its operating mode. It has two states: SEEK, in which it is looking for the first character, and REPLACE, in which it is looking for the end of a word.

In SEEK state, it scans through whitespace (space, tab, or newline), echoing characters. If it finds a printing character, it converts it to uppercase and goes to REPLACE state. In REPLACE state, it converts characters to lowercase until it hits whitespace, and then goes back to SEEK state.

The program uses the "tolower()" and "toupper()" functions to make case conversions. These two functions will be discussed in the next chapter.

   /* caps.c */

   #include <stdio.h>
   #include <ctype.h>

   #define SEEK 0
   #define REPLACE 1

   int main( void )
   {
     int ch, state = SEEK;
     while(( ch = getchar() ) != EOF )
     {
       switch( state )
       {
       case REPLACE:
         switch( ch )
         {
         case ' ':
         case '\t':
         case '\n':   state = SEEK;
                      break;
         default:     ch = tolower( ch );
                      break;
         }
         break;
       case SEEK:
         switch( ch )
         {
         case ' ':
         case '\t':
         case '\n':   break;
         default:     ch = toupper( ch );
                      state = REPLACE;
                      break;
         }
       }
       putchar( ch );
     }
   }

* The "puts()" function is like a simplified version of "printf()" without format codes. It prints a string that is automatically terminated with a newline:

   puts( "Hello world!" );

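The "gets()" function reads a line of text terminated by a newline, though it doesn't read the newline into the string. It is much less finicky about its inputs than "scanf()". However, it has no way to limit how many characters it stores, so an input line longer than the array will overrun it; "gets()" was dropped from the C11 standard for that reason, and "fgets()", described in the next section, is the safer choice for new programs. The examples here use "gets()" for simplicity: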

   /* cgets.c */

   #include <stdio.h>
   #include <string.h>
   #include <stdlib.h>

   int main( void )
   {
     char word[256], 
          *guess = "blue";

     puts( "Guess a color (use lower case please):" );
     while( gets( word ) != NULL )
     {
       if( strcmp( word, guess ) == 0 )
       {
          puts( "You win!" );
          exit( 0 );
       }
       else
       {
          puts( "No, try again." );
       }
     }
   }

This program includes the "strcmp" function, which performs string comparisons and returns 0 on a match. This function is described in more detail in the next chapter.

These functions can be used to implement filters that operate on lines of text, instead of characters. A core program for such filters follows:

   /* lfilter.c */

   #include <stdio.h>

   int main( void )
   {
     char b[256];
     while (( gets( b ) ) != NULL )
     {
       puts( b ); 
     }
   }

The "gets()" function returns a NULL, defined in "stdio.h", on input termination or error.

* The PC-based console-I/O functions "getch()" and "getche()" operate much as "getchar()" does, except that "getche()" echoes the character automatically.

The "kbhit()" function is very different in that it only indicates if a key has been pressed or not. It returns a nonzero value if a key has been pressed, and zero if it hasn't. This allows a program to poll the keyboard for input, instead of hanging on keyboard input and waiting for something to happen. As mentioned, these functions require the "conio.h" header file, not the "stdio.h" header file.


[3.2] C FILE-I/O THROUGH LIBRARY FUNCTIONS

* The file-I/O library functions are much like the console-I/O functions. In fact, most of the console-I/O functions can be thought of as special cases of the file-I/O functions. The library functions include:

   fopen()      Create or open a file for reading or writing.
   fclose()     Close a file after reading or writing it.

   fseek()      Seek to a certain location in a file.
   rewind()     Rewind a file back to its beginning and leave it open.
   rename()     Rename a file.
   remove()     Delete a file.

   fprintf()    Formatted write.
   fscanf()     Formatted read.
   fwrite()     Unformatted write.
   fread()      Unformatted read.
   putc()       Write a single byte to a file.
   getc()       Read a single byte from a file.
   fputs()      Write a string to a file.
   fgets()      Read a string from a file.

All these library functions depend on definitions made in the "stdio.h" header file, and so require the declaration:

   #include <stdio.h>

C documentation normally refers to these functions as performing "stream I/O", not "file I/O". The distinction is that they could just as well handle data being transferred through a modem as a file, and so the more general term "data stream" is used rather than "file". However, we'll stay with the "file" terminology in this document for the sake of simplicity.

* The "fopen()" function opens and, if need be, creates a file. Its syntax is:

   <file pointer> = fopen( <filename>, <access mode> );

The "fopen()" function returns a "file pointer", declared as follows:

   FILE *<file pointer>;

The file pointer will be returned with the value NULL, defined in "stdio.h", if there is an error. The "access modes" are defined as follows:

   r     Open for reading.
   w     Open and wipe (or create) for writing.
   a     Append -- open (or create) to write to end of file.
   r+    Open a file for reading and writing.
   w+    Open and wipe (or create) for reading and writing.
   a+    Open a file for reading and appending.

The "filename" is simply a string of characters.

It is often useful to use the same statements to communicate either with files or with standard I/O. For this reason, the "stdio.h" header file includes predefined file pointers with the names "stdin" and "stdout". There's no need to do an "fopen()" on them -- they can just be assigned to a file pointer:

   fpin = stdin;
   fpout = stdout;

-- and any following file-I/O functions won't know the difference.

The "fclose()" function simply closes the file given by its file pointer parameter. It has the syntax:

   fclose( fp );

* The "fseek()" function call allows the byte location in a file to be selected for reading or writing. It has the syntax:

   fseek( <file_pointer>, <offset>, <origin> );

The offset is a "long" and specifies the offset into the file, in bytes. The "origin" is an "int" and is one of three standard values, defined in "stdio.h":

   SEEK_SET    Start of file.
   SEEK_CUR    Current location.
   SEEK_END    End of file.

The "fseek()" function returns 0 on success and non-zero on failure.

The "rewind()", "rename()", and "remove()" functions are straightforward. The "rewind()" function resets an open file back to its beginning. It has the syntax:

   rewind( <file_pointer> );

The "rename()" function changes the name of a file:

   rename( <old_file_name_string>, <new_file_name_string> );

The "remove()" function deletes a file:

   remove( <file_name_string> );

* The "fprintf()" function allows formatted ASCII data output to a file, and has the syntax:

   fprintf( <file pointer>, <string>, <variable list> );

The "fprintf()" function is identical in syntax to "printf()", except for the addition of a file pointer parameter. For example, the "fprintf()" call in this little program:

   /* fprpi.c */

   #include <stdio.h>

   int main( void )
   {
     int n1 = 16;
     float n2 = 3.141592654f;
     FILE *fp;

     fp = fopen( "data", "w" );
     fprintf( fp, "  %d   %f", n1, n2 ); 
     fclose( fp );
   }

-- stores the following ASCII data:

     16   3.141593

The formatting codes are exactly the same as for "printf()":

   %d    decimal integer
   %ld   long decimal integer
   %c    character
   %s    string
   %e    floating-point number in exponential notation
   %f    floating-point number in decimal notation
   %g    use %e or %f, whichever is shorter
   %u    unsigned decimal integer
   %o    unsigned octal integer
   %x    unsigned hex integer

Field-width specifiers can be used as well. The "fprintf()" function returns the number of characters it dumps to the file, or a negative number if it terminates with an error.

The "fscanf()" function is to "fprintf()" what "scanf()" is to "printf()": it reads ASCII-formatted data into a list of variables. It has the syntax:

   fscanf( <file pointer>, <string>, <variable list> );

However, the "string" contains only format codes, no text, and the "variable list" contains the addresses of the variables, not the variables themselves. For example, the program below reads back the two numbers that were stored with "fprintf()" in the last example:

   /* frdata.c */

   #include <stdio.h>

   int main( void )
   {
     int n1;
     float n2;
     FILE *fp;

     fp = fopen( "data", "r" );
     fscanf( fp, "%d %f", &n1, &n2 );
     printf( "%d %f", n1, n2 );
     fclose( fp );
   }

The "fscanf()" function uses the same format codes as "fprintf()", with the familiar exceptions:

Numeric modifiers can be used, of course. The "fscanf()" function returns the number of items that it successfully read, or the EOF code, an "int", if it encounters the end of the file or an error.

The following program demonstrates the use of "fprintf()" and "fscanf()":

   /* fprsc.c */

   #include <stdio.h>

   int main( void )
   {
     int ctr, i[3], n1 = 16, n2 = 256;
     float f[4], n3 = 3.141592654f;
     FILE *fp;

     fp = fopen( "data", "w+" );

     /* Write data in:   decimal integer formats
                         decimal, octal, hex integer formats
                         floating-point formats  */

     fprintf( fp, "%d %10d %-10d \n", n1, n1, n1 );   
     fprintf( fp, "%d %o %x \n", n2, n2, n2 );
     fprintf( fp, "%f %10.10f %e %5.4e \n", n3, n3, n3, n3 );

     /* Rewind file. */

     rewind( fp );

     /* Read back data. */

     puts( "" );
     fscanf( fp, "%d %d %d", &i[0], &i[1], &i[2] );
     printf( "   %d\t%d\t%d\n", i[0], i[1], i[2] );
     fscanf( fp, "%d %o %x", &i[0], &i[1], &i[2] );
     printf( "   %d\t%d\t%d\n", i[0], i[1], i[2] );
     fscanf( fp, "%f %f %f %f", &f[0], &f[1], &f[2], &f[3] );
     printf( "   %f\t%f\t%f\t%f\n", f[0], f[1], f[2], f[3] );
     
     fclose( fp );
   }

The program generates the output:

   16         16         16
   256        256        256
   3.141593   3.141593   3.141593   3.141600

* The "fwrite()" and "fread()" functions are used for binary file I/O. The syntax of "fwrite()" is as follows:

   fwrite( <array_pointer>, <element_size>, <count>, <file_pointer> );

The array pointer is of type "void *", and so the array can be of any type. The element size and count, which give the number of bytes in each array element and the number of elements in the array, are of type "size_t", an unsigned integer type that is "unsigned int" or "unsigned long" on most systems. The "fwrite()" function returns the number of elements it actually wrote.

The "fread()" function similarly has the syntax:

   fread( <array_pointer>, <element_size>, <count>, <file_pointer> );

The "fread()" function returns the number of items it actually read.

The following program stores an array of data to a file, and then reads it back using "fwrite()" and "fread()":

 
   /* fwrrd.c */

   #include <stdio.h>
   #include <math.h>
   
   #define SIZE 20
   
   int main( void )
   {
     int n;
     float d[SIZE];
     FILE *fp;
   
     for( n = 0; n < SIZE; ++n )                 /* Fill array with roots. */
     {
       d[n] = (float)sqrt( (double)n );
     }
     fp = fopen( "data", "w+" );                 /* Open file. */
     fwrite( d, sizeof( float ), SIZE, fp );     /* Write it to file. */
     rewind( fp );                               /* Rewind file. */
     fread( d, sizeof( float ), SIZE, fp );      /* Read back data. */
     for( n = 0; n < SIZE; ++n )                 /* Print array. */
     {
       printf( "%d: %7.3f\n", n, d[n] );
     }
     fclose( fp );                               /* Close file. */
   }

* The "putc()" function is used to write a single character to an open file. It has the syntax:

   putc( <character>, <file pointer> );

The "getc()" function similarly gets a single character from an open file. It has the syntax:

   <character variable> = getc( <file pointer> );

The "getc()" function returns "EOF" at the end of the file or on error. The console I/O functions "putchar()" and "getchar()" are really only special cases of "putc()" and "getc()" that use standard output and input.

* The "fputs()" function writes a string to a file. It has the syntax:

   fputs( <string / character array>, <file pointer> );

The "fputs()" function will return an EOF value on error. For example:

   fputs( "This is a test", fptr );

The "fgets()" function reads a string of characters from a file. It has the syntax:

   fgets( <string>, <max_string_length>, <file_pointer> );

The "fgets()" function reads a string from a file until it finds a newline or has read <max_string_length> - 1 characters; if it reads a newline, the newline is kept in the string. It will return the value NULL at the end of the file or on an error.

The following example program simply opens a file and copies it to another file, using "fgets()" and "fputs()":

   /* fcopy.c */

   #include <stdio.h>
   #include <stdlib.h>
   
   #define MAX 256
   
   int main( void )
   {
     FILE *src, *dst;
     char b[MAX];
   
     /* Try to open source and destination files. */
   
     if ( ( src = fopen( "infile.txt", "r" )) == NULL )
     {
        puts( "Can't open input file." );
        exit( 1 );
     }
     if ( (dst = fopen( "outfile.txt", "w" )) == NULL )
     {
        puts( "Can't open output file." );
        fclose( src );
        exit( 1 );
     }
   
     /* Copy one file to the next. */
   
     while( ( fgets( b, MAX, src ) ) != NULL )
     {
        fputs( b, dst );
     }
   
     /* All done, close up shop. */
   
     fclose( src );
     fclose( dst );
   }

[3.3] C FILE-I/O THROUGH SYSTEM CALLS

* File-I/O through system calls is simpler and operates at a lower level than making calls to the C file-I/O library. There are six fundamental file-I/O system calls:

   creat()     Create a file for reading or writing.
   open()      Open a file for reading or writing.
   close()     Close a file after reading or writing.
   unlink()    Delete a file.

   write()     Write bytes to file.
   read()      Read bytes from file.

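These calls were devised for the UNIX operating system and are not part of the ANSI C spec. Use of these system calls requires a header file named "fcntl.h"; on UNIX-type systems "close()", "unlink()", "write()", and "read()" are declared in a second header file, "unistd.h":

   #include <fcntl.h>
   #include <unistd.h>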

* The "creat()" system call, of course, creates a file. It has the syntax:

   <file descriptor variable> = creat( <filename>, <permission bits> );

This system call returns an integer, called a "file descriptor", which is a number that identifies the file generated by "creat()". This number is used by other system calls in the program to access the file. Should the "creat()" call encounter an error, it will return a file descriptor value of -1.

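The "filename" parameter gives the desired filename for the new file. The "permission bits" give the "access rights" to the file. A file has three "permissions" associated with it:

   read       The file can be read.
   write      The file can be written.
   execute    The file can be run as a program.

These permissions can be set for three different levels:

   user       The user who owns the file.
   group      The group the file belongs to.
   system     Everyone else on the system.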

For the "creat()" system call, the permissions are expressed in octal, with an octal digit giving the three permission bits for each level of permissions. In octal, the permission settings:

   0644

-- grant read and write permissions for the user, but only read permissions for group and system. The following octal number gives all permissions to everyone:

   0777

An attempt to "creat()" an existing file (for which the program has write permission) will not return an error. It will instead wipe the contents of the file and return a file descriptor for it.

For example, to create a file named "data" with read and write permission for everyone on the system would require the following statements:

   #define RD_WR 0666
   ...
   int fd;                               /* Define file descriptor. */
   fd = creat( "data", RD_WR );

The "open()" system call opens an existing file for reading or writing. It has the syntax:

   <file descriptor variable> = open( <filename>, <access mode> );

The "open()" call is similar to the "creat()" call in that it returns a file descriptor for the given file, and returns a file descriptor of -1 if it encounters an error. However, the second parameter is an "access mode", not a permission code. There are three modes (defined in the "fcntl.h" header file):

   O_RDONLY    Open for reading only.
   O_WRONLY    Open for writing only.
   O_RDWR      Open for reading and writing.

For example, to open "data" for writing, assuming that the file had been created by another program, the following statements would be used:

   int fd;
   fd = open( "data", O_WRONLY );

A few additional comments before proceeding:

The "close()" system call is very simple. All it does is close an open file when there is no further need to access it. The "close()" system call has the syntax:

   close( <file descriptor> );

The "close()" call returns a value of 0 if it succeeds, and returns -1 if it encounters an error.

The "unlink()" system call deletes a file. It has the syntax:

   unlink( <file_name_string> );

It returns 0 on success and -1 on failure.

* The "write()" system call writes data to an open file. It has the syntax:

   write( <file descriptor>, <buffer>, <buffer length> );

The file descriptor is returned by a "creat()" or "open()" system call. The "buffer" is a pointer to a variable or an array that contains the data; and the "buffer length" gives the number of bytes to be written into the file.

While different data types may have different byte lengths on different systems, the "sizeof" operator can be used to provide the proper buffer length in bytes. A "write()" call could be specified as follows:

   float array[10];
   ...
   write( fd, array, sizeof( array ) );

The "write()" function returns the number of bytes it actually writes. It will return -1 on an error.

The "read()" system call reads data from an open file. Its syntax is exactly the same as that of the "write()" call:

   read( <file descriptor>, <buffer>, <buffer length> );

The "read()" function returns the number of bytes it actually reads. At the end of file it returns 0, and it returns -1 on error.
