
Functions and Libraries

All practical C programs achieve (considerable) complexity by breaking the problem down into smaller pieces. Code is then written to reach each subgoal, independently tested, and combined to form a complex system. Functions are the method by which this is achieved.

A library is a collection of often-used functions which are grouped together in a single place. Over time, some libraries have become standardised to perform tasks that almost all programs require, such as input from the keyboard and output to the terminal, and these are distributed with the compiler as "standard libraries".

As the program increases in complexity, the same variable name may be used more than once for a different purpose. The idea of scope means this can be achieved without conflict, as a variable can be made to "last" only for the part of the program which needs it.

Function Invocation

One of the most powerful concepts of the C language, and indeed of any computer language, is that of a function. If a program were written as a single, large piece of code, it would very quickly become unruly, too complicated, and impossible to understand. In order that the program can be developed in the first place, or maintained after it has been written, it is necessary to find some mechanism to break it down into manageable chunks. To this end, the code is split into functions.

Functions are sub-programs designed to perform a single specific task. The task should in some sense be atomic, and general. This way a properly written function can be used in a variety of places, cutting down the number which have to be written, and increasing the reliability of the program as a whole.

Some programmers think that "If a function won't fit on the screen, it should be split into several smaller functions". This is a laudable aim, but the fact is that some algorithms simply take a lot of space to express, so there is no "maximum length" for a function any more than there is a maximum length for a novel or a newspaper article. Instead, having designed or written a function, one should ask oneself whether it could not be expressed more generally, concisely, or in smaller elements. A better aphorism than the above would be, "Inside every large program [or function] there's a small one fighting to get out".

All C programs are themselves a function. When a C program runs, some initialisation is performed and then the function called main is called.

A further advantage of functions is that they can be grouped together into libraries. In loose terms, one might think of a library as a program without a main function. It is a collection of subroutines which can be made available to other programs. Functions are grouped according to utility to form, for example, a "math library", or "sound library", or "mpeg library". By default, all programs are linked against the "standard C library", which provides standard functions we have been using already like puts and printf.

A function call is an expression that passes control and possibly some arguments to a function. A simple example might be a function to find the square root of a number. More complicated functions could perform some statistical analysis on a set of data or find the currents and voltages of an electric circuit. The general form of a function call is:

A Function Call

expression(argument-list);
where expression is usually just the name of the function, but might be any valid C expression of type pointer-to-function (this will be dealt with in much more depth in the section on pointers). The argument-list is a comma-separated list of expressions. The values of these expressions are evaluated and made available to the function.


The expressions in the function argument list can be evaluated in any order, so a call in which the side effect of one argument changes a value used by another argument has no defined meaning. Consequently, one should not write calls like myfunc(2*x, x++). The function-call operator guarantees only that all side effects in the argument list are complete before control passes to the called function, so myfunc(x++, y++) is guaranteed to increment x and y and to pass their original values to myfunc, but there is no guarantee about which gets incremented first.


The following example demonstrates a simple way of calling a function supplied with the C mathematics library, "m". ("m" is the name of the library, not another programming language!)

#include <stdio.h>
#include <math.h>

int main()
{
  double theta = 60.0 * M_PI / 180.0;
  double sinTheta = sin(theta);

  printf("%f\n", sinTheta);
  return 0;
}
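This example, although very simple, makes two important points about library functions: the header file which declares the function (here math.h) must be included, and the program must be linked against the library which contains it (here by passing the "-lm" flag to the compiler). Let's see that in action. The first attempt to compile omits the "-lm" flag.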

Calling a Function

Forgetting the Library

$ gcc sintheta.c
/tmp/cc2U6lKj.o(.text+0x28): In function `main':
: undefined reference to `sin'
collect2: ld returned 1 exit status

Correct compilation, and execution.

$ gcc sintheta.c -lm
$ ./a.out
0.866025

Defining a Function

When writing a program, one very quickly needs to define one's own functions instead of relying solely on those put into libraries by other programmers. A function definition contains all of the declarations and statements required to define the function's action. Here is its general form:

General Form of a Function Definition

type-specifier function-name(formal-parameter-list)
compound-statement
The type-specifier indicates the type of the result of calling the function. The value returned is an argument to the return statement, which unless the type-specifier is void, must appear within the compound statement. The formal-parameter-list is a comma-separated list of arguments taking the same form as normal variable declarations. Recall that variables may be declared at the beginning of any compound statement, and the one providing the body of the function is no exception. In addition to the variables thus declared, those declared in the formal-parameter-list are available to the function.

Here is a more practical example: a function which returns the product of two integers as a long integer value:

long product (int i1, int i2)  /* No semicolon here!! */
{
  /* Function Body. No variables required,
     so straight on with the code */
  return (long)i1 * (long)i2; /* Hardly a taxing function! */
}


The inline Modifier

In the section on data types, the concept of a register variable was introduced. The aim was to increase the speed of program execution by hinting to the compiler that it would be a great idea to store the value of a particular variable in an on-chip register. This turns out to be a bad idea, because the compiler usually does a better job than a human at deciding which system resources to use, and for what.

Each time a function is called, there is an overhead associated. Values have to be copied into the parameters of the function, and a jump is made to the beginning of the function in memory. At the end of the function, a return jump has to be made to the place where the function call originated. For very short functions such as product above, this process will take longer than running the function body itself!

Preceding the declaration of product with the keyword inline hints to the compiler that instead of making a genuine function call, the code can be written in the program directly at that point. So if two calls to the function are made, the code is duplicated; if ten calls are made, the code appears ten times. This has the effect of making the final program larger, but faster (one hopes!).

It is a bad idea to use inline for the same reason that it is a bad idea to use register. If the compiler thinks it is best to make a function inline rather than to use the normal method of calling it, it will do so. You can modify its behaviour by giving command line options: gcc supports at least -finline-functions, -finline-limit=n, -fkeep-inline-functions, -fno-default-inline and -fno-inline. The best thing to do is probably just ignore all of these, and let the compiler do as it thinks fit.

The Function Prototype

The function prototype tells the compiler the form of a function before it is called. Specifically, it informs the compiler of the function's name, return type, and the type of each of its parameters. The form of a function prototype is just like the definition, except that the compound statement is replaced by a semicolon (i.e. you leave out the code and put a semicolon on the end).

The prototype for product defined above would look like this:

Prototype of product

long product(int, int);
Note that the names of the parameters are entirely unimportant. The compiler takes no notice of the names of the formal parameters until the function definition is read. It is only interested in knowing in advance that when calling the function, two integers will be needed, and the result will be a long integer. If, later in the source code, product is called with the wrong number of arguments, the compiler is in a position to issue a message saying what the problem is.

Although, as indicated above, it is in order to omit the names of the formal parameters in a function prototype, it is quite legal to include them. Some programmers prefer to do this as an aid to readability; it reminds them when using the function what each parameter is for. Whether or not to include the formal parameter names in the function prototype is purely a matter of taste.

Function Prototypes and Header Files

In the section on the C preprocessor, we saw how a preprocessor directive #include could be used to include a header file. Prototypes are particularly useful in conjunction with header files.

Suppose a program needs to be broken up into several source code files for ease of maintenance. One of the files, say product.c, contains the code for the product function defined above. Conventionally, the prototype for this function will be placed in the file product.h. A second file, main.c contains the main program which calls product. The situation is now as shown below.

The Product Program's Source Files


File: product.c

#include "product.h"

long product (int i1, int i2)
{
  return (long)i1 * (long)i2;
}

File: main.c

#include <stdio.h>
#include "product.h"

int main()
{
  int i, j;
  printf("Two ints, please> ");
  scanf("%d%d", &i, &j);
  printf("%d x %d = %ld\n",
         i, j, product(i, j));
  return 0;
}
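
The listing above does not show product.h itself. A minimal sketch of what it might contain follows; the include guard (PRODUCT_H) is a common convention and an assumption here, not something the course prescribes.

```c
/* File: product.h (assumed contents) */
#ifndef PRODUCT_H
#define PRODUCT_H

long product(int i1, int i2);

#endif /* PRODUCT_H */
```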

Having written these two files, one could compile them "by hand" or write a makefile. Since there are only two files, and writing makefiles is outside the scope of this course, we'll do it by hand.
$ gcc -c product.c
$ gcc -c main.c
$ gcc -o product main.o product.o
$ ./product
Two ints, please> 3 4
3 x 4 = 12
$
The -c flag tells the compiler to produce an object file, but not to attempt to link it at this stage. The linkage is finally achieved by passing the object files to the compiler, and asking it to write its output to a file called product.

Here, the prototype for the product function is read into main.c by including product.h. This is fair enough, but why is it also read into the other file, product.c, which contains the actual source code for the subroutine? The answer is to do with reliability. Suppose another programmer wanted to change the function so that it would operate on double-precision floating point numbers, i.e. it would have the prototype double product(double, double);. The programmer then forgets to change the header file having made the changes only to the file containing the source code. When attempting to compile product.c in the procedure above, there will be an error to the effect that the declaration of product does not match its prototype. If the programmer hadn't included the file product.h from within the file product.c, the compilation and linkage would have been successful, but the result would have been nonsense, with integer values being confused with floating-point representation in the main program.

Return Values

The return statement terminates the function, returning control to the calling context at the point immediately after the function call. There may be several return statements in a function, for example if various sections of code are executed depending on different circumstances. However, this can sometimes obfuscate the meaning of the code, so don't abuse this privilege. In the case of void functions, there will be no returned value.

The value of the return statement's argument is converted to the type in the function declaration before it is returned. It is an error to write an expression where no such type conversion is available automatically. Any primitive data type or structured data type can be returned, or a pointer. The use of pointers in passing data to and from a function is an important topic, and will be dealt with in detail later.


It is possible to return a structured data type like a structure or union from a function, but this should usually be avoided. The reason is that such a type may be very large, which results in a potentially large overhead as the data forming the result of the function call is copied into the calling context.

It is better to use pointers to pass references to large quantities of data. This is the only way of passing arrays to or from functions.


Implicit Declaration Problems

Take a look at the following code, bearing in mind that the function prototype has been commented out:

#include <stdio.h>

/* double twice(double); */

int main()
{
  printf("%f\n", twice(6));
}

double twice(double x)
{
  return 2.0*x;
}
Compiling and running this code gives the following result:
$ gcc implicit.c
implicit.c:11: warning: type mismatch with previous implicit declaration
implicit.c:7: warning: previous implicit declaration of `twice'
implicit.c:11: warning: `twice' was previously implicitly declared to return `int'
$ ./a.out
-1.998619
What has gone wrong?

The reason for this is that a C compiler treats the source code as if it read it only once, from top to bottom, producing object code as it goes. When it reaches the function call on line 7, the compiler has not yet read the definition of the function twice. The traditional rules of C say that if a function has not yet been declared, it should be implicitly expected to return an integer. This was probably a mistake, but that was the law as laid down by the authors of the programming language in the 1970s. (The C99 standard withdrew the rule, but many compilers still accept such code with a warning so that old programs continue to build.) Consequently, code is generated which expects the twice function to return an integer. This "integer" is converted into floating point and printed out.

When the compiler reads further, it discovers that twice is in fact going to return a floating-point value. It can't go back and change the code it produced, but instead emits the three-line remorseful warning shown above. The compiler is saying, "The code generated by the statement at line 7 was done on the understanding that the function call was going to result in an integer value. Now it turns out, at line 11, that isn't the case, so that code I wrote for you before is probably not going to give the correct results". Indeed it doesn't: the bit patterns which happen to express the correct answer as a floating point number are misinterpreted as an integer, which is then converted to a floating point number and printed. The result is nonsense.

If the comments are removed from the prototype, the warnings disappear and the correct answer is displayed. The compiler has been warned of the return type of twice before the function was used. Note that if the programmer had followed good practice and had included the prototype in a header file included at the top of the source code, the problem would never have arisen.

Formal Parameters

Formal parameters are variables that receive values passed to the function when it is called. We have seen already how a function's formal parameter list is declared via its function prototype. The compiler uses this list to ensure that the correct types are delivered to a function each time it is called.

Inside the function, the formal parameters are then used just like any other local variable. Changing the value of a parameter within the function does not change the value of a variable outwith the function which was used as a parameter in the function call. When the function is called, a copy of all the required values is made, and it is the copy which is made available to the C code within the function. We say that the parameter behaves like a local variable, or that its scope is confined to the function.


Formal parameters of main

The order and meaning of the parameters of a function are normally obvious: usually, the programmer will have written the function and know what the parameters are for. If the function is part of a library there will be documentation, or at least a header file, to consult. The only function written by the programmer which is called "out of the blue" is the main function, which the system calls when the program starts to run.

So far, we have tacitly ignored the arguments of the main function, but in fact it is also allowed to take two. ISO C permits main to be implemented either with no parameters at all (as we have seen so far) or with two arguments as follows:

Parsing Command-line Arguments

#include <stdio.h>

int main (int argc, char *argv[])
{
  int i;
  /* argv[0] is the program name; number the output from 1 */
  for (i = 0; i < argc; i++)
    printf("Argument %d: %s\n", i + 1, argv[i]);
  return 0;
}

argc is the argument count and indicates how many arguments were typed on the command line. If the program was run from the command line, argc will always be at least 1, and the name of the program will be the first string in the array argv. The rest of the array argv contains the other fields from the command line. Usually this means the words appearing on the command line after the command, one per array element, although this might change depending upon exactly what was typed in to the shell. Strings surrounded by quotation marks "like this" count as a single word, for example. When the above program is compiled to the file a.out and run with the command shown, it behaves as follows:

$ ./a.out hello there "Nick Bailey"
Argument 1: ./a.out
Argument 2: hello
Argument 3: there
Argument 4: Nick Bailey
$

Complex programs usually have complex rules about how parameters can be passed, and to help with this there is a function called getopt (declared in unistd.h) which will help considerably. There is also a more elaborate library routine with similar functionality called argp_parse, provided by the GNU C library, which copes with almost any eventuality.


Variadic Functions

There is nothing unexpected about the vast majority of formal parameter lists when it comes to function declarations, but occasionally one comes across a situation in which it would be most convenient if a function call could have a variable number of arguments. One such example is the printf library function, which can be used in a variety of ways: printf("Name: %s; Age %d", nm, age); takes three arguments; printf("Result is: %d\n", answer); takes only two. How then to declare its function prototype?

C provides a set of macros to deal with functions which have a variable number of arguments, or variadic functions. Consider the following code.

#include <stdio.h>
#include <stdarg.h>

void myfunc(int x, ...)
{
  va_list args;

  printf("There are %d arguments.\n", x);
  va_start(args, x); /* 1 required argument, its name is x */
  while (x--)
    printf("%d\t", va_arg(args, int));
  printf("\n");
  va_end(args);
}

int main()
{
  myfunc(3, 10, 20, 30);
  return 0;
}
When run, it prints out on the screen:
There are 3 arguments.
10      20      30
The macros are defined in stdarg.h. The prototype of the function with a variable number of arguments declares all of the mandatory ones as normal, and then has ... in place of the variable list. The variadic function is required to declare a special variable of type va_list to maintain the required information when reading the arguments. The macro va_start is then expanded, naming the last mandatory argument. After this, each use of va_arg returns a value of the specified type. Before exiting, it is important that the function expands the va_end macro.

va_lists can be very useful in writing general-purpose functions, but are easily abused. Before using them, one should consider whether it isn't better to use a structured type like a list, and write functions which accept a single argument. As in many cases when programming, it is a matter of choosing the solution which most elucidates the implementation.


Actual Parameter Lists

Actual parameters are the values that are passed to the function by its caller. Each value is calculated before the call is made; if it comes from a variable, that variable is stored outside the function and remains distinct from the function's formal parameters. The actual parameters are copied to the variables which are supplied by the function definition.

Part of the overhead associated with a function call arises from the fact that the value of the actual parameter is copied to the function's formal parameter when the function call is made. However, the act of copying has overwhelming advantages: it means that it is quite legal to operate on parameters of the function which are constant outside:

Locality of Functions' Formal Parameters


It is illegal to write:

...
2 *= 10;
printf("%d\n", 2);
...

It is legal to write:

...
print10(2);
...
void print10(int num)
{
  num *= 10;
  printf("%d\n", num);
}

The first example is clearly nonsense, because 2 is a constant and can't be changed to 20, but when expressed as a function, the value of the actual parameter (2) is passed (i.e. copied) to the formal parameter num of the function print10, whereafter it is possible to modify it just like any other variable.

We say that the parameter num is local to print10, and that 2 has been passed by value to print10.


Some other languages permit an alternative method of transferring the actual parameters from the caller to the called: pass-by-reference. Instead of the value contained by a variable, or the result of an expression, the address in memory where the variable is stored is passed. Such an address is called the variable's reference.

This might seem a subtle difference at first sight, but in fact the change in behaviour is profound. If C used pass-by-reference, changes made to a formal parameter within a function would change the variable associated with the actual parameter in the caller. Sometimes, this might be an advantage: for example, scanf, a function used to read input from the user, needs to modify the value of a variable in the caller's context.

C only supports pass-by-value, so functions like scanf expect the programmer to call them by writing an expression which calculates the address of the variable it will set. In the C language, this is a simple matter of putting the reference ("address-of") operator in front of the variable, which is why we see calls like scanf("%d", &i); (if i is an integer). In fact, if scanf("%d", i); had been written, the integer read from the user would have been stored somewhere, but probably not in the variable i! Unless, by happy accident, i happened to contain its own address, the value would be written at some random location in memory, the result probably being that the program simply crashes.


Variable Scope

An important concept which has so far been glossed over is that of variable scope. The scope of a variable determines which part of the program can access it. We have already stated that the formal parameters of a function are local to that function, but it is possible to reduce the scope of variables further, making them local to a particular compound statement. Consider the following program:

More than one variable of the same name can be declared in the same program, and C follows the scoping rules to determine which variable is being referenced.

External Variable Example


file2.c

extern int globalVar;
...
/* further variable and function declarations all of which may use globalVar. Its initial value is 0, as defined in file1.c */
...

file1.c

int globalVar = 0;
...
/* globalVar is available throughout this file, and to all other files making up the same program which declare it extern (e.g. file2.c) */
...

Summary:

Functions form an important part of any programming language, because they permit a problem to be broken up into small, easily solved and independently testable parts. C implements pass-by-value function calls; pass-by-reference is achieved by pre-evaluation of the address of a variable using the reference operator "&".

Calling a function causes the actual parameters supplied to be copied to the function's formal parameter, type conversion being accomplished in accordance with the rules of type promotion and with reference to the function's prototype. The prototypes of functions declared in a file are normally collected together into an appropriate header file which is included both in the current file and any file where any of the functions are called.

The idea of local copies of a variable which last for the duration of the function's execution introduces the concept of variable scope, a further powerful organisational tool for dealing with the huge complexity of large software systems.


Copyright and Acknowledgements

The tools upon which this course relies are Copyright the Free Software Foundation, where they are made available under the GPL (GNU General Public Licence).

The content of this course was derived from that generated by many ex-colleagues at the University of Leeds, Department of Electronics and Electrical Engineering. Much of the content has been reworked, and substantially augmented, by Dr N J Bailey, Centre for Music Technology, The University of Glasgow. This manifestation is Copyright N J Bailey; some of the content is Copyright The University of Leeds.

Diagrams on this resource are drawn in XFig and are rendered by the browser using The University of Hamburg's Simple FIG viewer applet which is Copyright (C) 1996-2002 F.N.Hendrich, hendrich@informatik.uni-hamburg.de.

The source code, programming examples and exercises are all specific to this course, and are Copyright, Dr N J Bailey.

The applet for viewing and demonstrating C programs is Copyright Dr N J Bailey, and is to be found documented and with its source code on the Centre for Music Technology website under Software.