Arrays are C's way of storing a number of entities of the same type and accessing them easily. They are implemented through the use of pointers: special variables which store not a value, but the location where the value is stored.
The manipulation of pointers is a key feature of C, and it is important to understand their use. Although the concept is reasonably simple and very powerful, it is nonetheless possible to write programs using pointers which are very difficult to understand, so it is particularly important to develop a good style when dealing with them.
An array is a way of collecting together a number of entities of the same type to form a single variable. The entities thus collected are referred to as elements. Having declared an array, it is possible to refer to one of the elements via its index, or its order-of-appearance within the array.
The index of an array can be a constant, e.g. a[10]=3; or x=f[12];, or, crucially, any integer expression. One might, for example, set alternate elements to +1 and -1 by enclosing the statements a[2*i]=1; a[2*i+1]=-1; within a suitable loop.
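As a minimal sketch of such a loop, the function below sets the even-indexed elements to +1 and the odd-indexed elements to -1. The name fill_alternate and the length parameter n are assumptions made for illustration, and the sketch assumes n is even.

```c
#include <stddef.h>

/* Hypothetical helper: sets a[0], a[2], ... to +1 and a[1], a[3], ... to -1.
   Assumes n is even, so that index 2*i+1 never runs past the end. */
void fill_alternate(int a[], size_t n)
{
    size_t i;

    for (i = 0; i < n / 2; i++) {
        a[2 * i] = 1;        /* even index */
        a[2 * i + 1] = -1;   /* odd index */
    }
}
```

Called as fill_alternate(a, 10) on an array declared int a[10];, this touches exactly the elements a[0] to a[9].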
In the C language, an array of n elements has a range of indices starting at 0 and running up to n-1, as shown in the inset to the left.
The general form of an array declaration is as follows:
The array a in the example above therefore has elements numbered 0 to 9, and not from 1 to 10 as in some other languages. 0 and 9 are referred to as the bounds of the array.
C does not check in any way that the index of the array is in bounds: this is left up to the programmer. Reading from or writing to an element which is out of bounds will almost certainly have undesirable results. It may corrupt the values of other variables, or it may simply cause the program to crash with a segmentation fault.
The zero origin used in array indexing has its roots in the way the C compiler allocates storage.
The array elements are stored in a contiguous memory block, starting with the beginning of the array at the lowest address, progressing higher in memory as the array index increases.
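The contiguous layout can be checked with a short sketch. The helper name byte_offset is an assumption for illustration; it reports how far, in bytes, element i lies from the start of an integer array.

```c
#include <stddef.h>

/* Hypothetical helper: distance in bytes from the start of the array
   to element i. Because elements are contiguous, this is i * sizeof(int). */
size_t byte_offset(const int a[], size_t i)
{
    return (size_t)((const char *)&a[i] - (const char *)&a[0]);
}
```

On a machine where an integer occupies four bytes, byte_offset(a, 3) would be 12.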
The 1-D integer array a will start at an arbitrary available address assigned by the linker, say for example 0x2000. If the size of an integer is four bytes, the address of each of its elements will be as shown in the inset on the right. There are two important points to note:
Arrays may be of storage class automatic, external or static, but not register. ISO standard C allows both automatic and static arrays to be initialised on declaration. The syntax to achieve this is:
If no initialisation list is provided, the array is filled with zeros only if it has static storage, that is, if it is static or external! Automatic arrays initially hold undefined values. Using these values before initialisation will cause the program to behave unpredictably.
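A minimal sketch of both cases follows. The names table, sum_static and sum_automatic are assumptions for illustration: the first array is static and has no initialisation list, the second is automatic and is initialised on declaration.

```c
static int table[5];    /* static, no initialisation list: all zeros */

/* Hypothetical helper: adds up the elements of the static array. */
int sum_static(void)
{
    int total = 0;
    int i;

    for (i = 0; i < 5; i++)
        total += table[i];
    return total;
}

/* Hypothetical helper: adds up an automatic array initialised on declaration. */
int sum_automatic(void)
{
    int a[5] = {1, 2, 3, 4, 5};   /* every element given a defined value */
    int total = 0;
    int i;

    for (i = 0; i < 5; i++)
        total += a[i];
    return total;
}
```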
Passing a whole array to a function would potentially result in a huge inefficiency. When the function is called, a local copy of the actual parameter has to be made, and this could require very large quantities of data to be duplicated where the actual parameter is an array variable.
Consequently, C always passes an array to a function by passing the address of its first element. This is very fast, but has the unfortunate side effect of destroying the locality of the array elements: when they are changed, the corresponding elements in the array in the calling context will change also.
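The side effect can be seen in a small sketch. The function name clear and its length parameter n are assumptions for illustration; the point is that the function receives only the address of the first element, so its writes land in the caller's array.

```c
#include <stddef.h>

/* Hypothetical helper: zeroes n elements. The parameter a is really a
   pointer to the caller's first element, so the caller's array changes. */
void clear(int a[], size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        a[i] = 0;
}
```

After int v[3] = {1, 2, 3}; clear(v, 3); every element of v in the calling context is zero.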
Pointer variables or constants store a memory address, usually the memory address of another variable. Pointers are often used to provide a mechanism for passing large quantities of data, such as arrays or large structures, to and from functions.
Pointers are declared by preceding the variable name with a '*' character. This gives rise to no end of confusion, because when appearing within the body of a program, '*' is pronounced "contents of memory pointed at by...", where it is the opposite of '&' (pronounced "address of..."). In a declaration, '*' is pronounced "is a pointer to" and is read right-to-left. So int *p; is pronounced "p is a pointer to an integer", but *p * 2; is pronounced "contents of address pointed to by p, times two".
The above program demonstrates pointers in use. Note that, in general, we don't care about where exactly a pointer variable points, but only about the thing it is pointing at.
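A minimal sketch of a pointer in use, with the hypothetical function name pointer_demo, shows the declaration and both operators together. The value of the address held in p is never inspected; only the integer it points at matters.

```c
/* Hypothetical demonstration: changes an integer through a pointer to it. */
int pointer_demo(void)
{
    int i = 5;
    int *p = &i;    /* "p is a pointer to an integer", set to the address of i */

    *p = *p * 2;    /* "contents of address pointed to by p, times two" */
    return i;       /* i itself was changed through p */
}
```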
As well as pointing at entities of the declared type, all pointers can be set to the special value NULL which means they are pointing at nothing. Note that this is very different from not assigning a value to a pointer variable at all: uninitialised variables might contain nonsense which will mean they are pointing at random locations in memory and cause all sorts of trouble (usually segmentation faults).
The code below illustrates two very common mistakes.
The right-hand-side shows the declaration of a pointer variable which is uninitialised. Just like all uninitialised automatic variables, this pointer contains a random value. Even experienced programmers occasionally forget to reserve the memory the pointer is supposed to reference, and simply write the subsequent assignment statement. This usually causes the program to crash instantly, because writing to the random address generates a segmentation fault.
Of these two bugs, the first is the more insidious. The error will be reported much later than the offending pointer variable initialisation, and will not show up until dereferencing is attempted, possibly even in a different source code file. The only way to track down this sort of bug is to use the debugger. The value of the pointer variable can then be examined just before it is dereferenced. This sometimes throws some light on the true source of the problem.
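One defensive pattern, sketched here under the assumption that the memory is obtained from malloc, is to reserve the memory before writing through the pointer and to check for NULL before dereferencing. The function name make_int is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: reserves memory for one integer, stores value in it
   and returns the pointer. Returns NULL if no memory could be reserved. */
int *make_int(int value)
{
    int *p = malloc(sizeof *p);   /* reserve the memory first */

    if (p != NULL)                /* only dereference a valid pointer */
        *p = value;
    return p;
}
```

The caller is expected to test the result against NULL and to release the memory with free when it is no longer needed.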
Pointers and arrays are intimately related. When an array is declared, memory sufficient to store the specified number of elements is reserved. The name of the array by itself is synonymous with the address of the first element. The name of the array without the trailing [] behaves, in most expressions, like a pointer constant which points at the first element of the array. Its value cannot be changed, even to another address within the same array.
Both of the above code fragments double all of the elements in the integer array. The one on the left uses array indexing, and the one on the right uses pointer arithmetic. The version using pointer arithmetic looks more long-winded than the one using array indexing, but this is because the size of the array was known in advance and was a constant. Pointer-based methods really come into their own when this isn't the case, as shown in the string copying example illustrated below.
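The two approaches can be sketched as follows. The constant SIZE and the function names double_indexed and double_pointer are assumptions for illustration.

```c
#define SIZE 10

/* Doubles every element using array indexing. */
void double_indexed(int a[SIZE])
{
    int i;

    for (i = 0; i < SIZE; i++)
        a[i] = a[i] * 2;
}

/* Doubles every element using pointer arithmetic. */
void double_pointer(int a[SIZE])
{
    int *p;

    for (p = a; p < a + SIZE; p++)   /* a + SIZE is one past the last element */
        *p = *p * 2;
}
```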
Since C represents a string as an array of characters, the last of which has the special ASCII-null value '\0' (not the same as NULL!), a common method of processing a string is to set a pointer to its first element, and continue incrementing the pointer until the final null character is reached. This is easily achieved by enclosing the desired processing in a while statement, as shown below:
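A minimal sketch of string copying in this style is given here. The function name copy_string is hypothetical, and the sketch assumes the destination has room for the whole source string including its terminating '\0'.

```c
/* Hypothetical string copy. dst must have room for every character of src
   plus the terminating '\0'. */
void copy_string(char *dst, const char *src)
{
    while (*src != '\0') {   /* stop at the final null character */
        *dst = *src;         /* copy one character */
        dst++;
        src++;
    }
    *dst = '\0';             /* terminate the copy */
}
```

No length is passed: the loop relies entirely on the null character to find the end of the string.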

If pointers simply hold an address, why is it necessary for the compiler to distinguish between pointers to different types? Why is a pointer to a character fundamentally different to a pointer to an integer?
There are two answers. The first has to do with basic program semantics. If xp and ip are both pointers, what machine instructions should the compiler generate to evaluate *xp/*ip? It is impossible to tell. But if we know that xp is a pointer to a double, and ip is a pointer to an integer, then we can state the answer: a double precision number has to be fetched from memory address xp; an integer has to be fetched from memory address ip and converted to a double; the first double is then to be divided by the second.
The second, equally important, answer is that knowing the referenced type facilitates pointer arithmetic.
In the above example, how does the compiler go about assigning p and c? In C, these two assignment statements mean p is assigned the value stored at address one double after the one xp is pointing at and c is assigned the character stored at the address two characters after the one s is pointing at.
These statements are exactly equivalent to p=xp[1]; c=s[2];. In order to generate the correct code, the compiler has to be able to calculate the address of xp[1] and s[2], and therefore has to be told the size of the entity the pointer is referencing. Taking into account the size of the referenced entity in the performance of these calculations is the key feature of pointer arithmetic.
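The equivalence can be sketched with two small functions whose names, second_double and third_char, are assumptions for illustration. In each case the compiler scales the offset by the size of the referenced type.

```c
/* Returns the double stored one double after the one xp points at. */
double second_double(const double *xp)
{
    return *(xp + 1);   /* same as xp[1]; the step is sizeof(double) bytes */
}

/* Returns the character stored two characters after the one s points at. */
char third_char(const char *s)
{
    return *(s + 2);    /* same as s[2]; the step is one byte per character */
}
```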
The equivalence of the square bracket and * form of dereferencing raises the stylistic question of whether it is better to write *p or p[0] (p being a pointer), and there is no correct answer. Modern compilers will not produce different code, so, as in so many similar circumstances, the form which best elucidates the operation of the code should be chosen.
Pointer arithmetic is very convenient for the C programmer. If pt is a pointer to thing, pt++; increments pt so that it points to the next thing and pt+=5; moves pt so that it points 5 things further down memory. This makes it very easy to write routines like the string copying one above which operate on arrays of any type.

Because of the very heavy dependence of C on pointers and the frequent use of the dereference (*, "contents of") and reference (&, "address of") operators, it is vitally important to become fluent in the use of these types.
You can declare an array of pointers as easily as an array of any other type. int *a[100]; -- there! a is an array of 100 pointers to integers. a[1] is the second pointer, and *a[1] is the value at which it points. Don't forget that declaring an array of pointers to integers doesn't declare the integers themselves, so before you use *a[1] you must ensure a[1] has been set to point somewhere sensible (see Common Pitfalls above).
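A short sketch, using a hypothetical function sum_through_pointers and an array of only three pointers, shows each pointer being set to point at a real integer before it is dereferenced.

```c
/* Hypothetical demonstration of an array of pointers to integers. */
int sum_through_pointers(void)
{
    int x = 1, y = 2, z = 3;
    int *a[3];            /* three pointers; this declares no integers */
    int total = 0;
    int i;

    a[0] = &x;            /* point each element somewhere sensible first */
    a[1] = &y;
    a[2] = &z;

    *a[1] = 20;           /* writes to y through the second pointer */

    for (i = 0; i < 3; i++)
        total += *a[i];
    return total;         /* 1 + 20 + 3 */
}
```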
Pointers to arrays are easily obtained: recall that the name of an array without its [] is synonymous with the address of its first element. However, such an identifier represents a pointer constant, and declaring a pointer variable which points at an array is a little harder.
The idea behind this code is that filladdr can be passed a pointer to an array of known size later in the program, and thus used to operate on one or the other of the strings. This is a rather contrived example, because we could have got away with declaring the parameter of filladdr as simply a pointer to a character, but this would fail when used with multidimensional arrays (arrays of arrays) as explained below. In fact, genuine pointers to arrays, declared with the form int (*iptr)[SIZE]; (note the parentheses), are rarely seen outside of the multidimensional context.
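As an illustrative sketch of a genuine pointer to an array, using integers rather than the strings of the example discussed here, the hypothetical function sum_row takes a pointer to an array of exactly SIZE integers. SIZE and sum_row are assumptions for illustration.

```c
#define SIZE 4

/* iptr is a pointer to an array of SIZE integers. The parentheses matter:
   without them the declaration would be an array of SIZE pointers. */
int sum_row(int (*iptr)[SIZE])
{
    int total = 0;
    int i;

    for (i = 0; i < SIZE; i++)
        total += (*iptr)[i];   /* dereference first, then index */
    return total;
}
```

Given int v[SIZE];, the call is sum_row(&v): the address of the whole array is passed, not the address of its first element.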

A pointer variable holds the address of another variable, and there is no reason that the address it holds cannot be that of another pointer variable. If i is an integer, and we declare int *ip = &i to be a pointer to it, there is no reason not to be able to declare a pointer to this variable int **ipp = &ip;. ipp is a pointer to a pointer to an integer, so *ipp is the pointer to an integer ip, and **ipp is the value stored in i itself.
There is no limit to the number of indirections which can be achieved, although clearly readability of the code suffers dramatically as the number of stars increases. The inset on the right shows a very extreme example with a variable int ****ip4; -- ip4 is a pointer to a pointer to a pointer to a pointer to an integer (count the arrows in the diagram).
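The two-level case can be sketched as follows, with the hypothetical function name through_two_levels. It uses the same names i, ip and ipp as the text.

```c
/* Hypothetical demonstration: changes i through a pointer to a pointer. */
int through_two_levels(void)
{
    int i = 3;
    int *ip = &i;      /* ip points at i */
    int **ipp = &ip;   /* ipp points at ip */

    **ipp = 7;         /* *ipp is ip, so **ipp is i */
    return i;
}
```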


All variables occupy an address in the machine's memory, so it is possible to have them pointed at by pointer variables. However, functions also occupy unique memory locations in the computer's program memory, so it is possible to store the address of a function in a pointer variable too.
This is a rarely used, but very, very powerful technique. It permits the construction of what computer scientists call higher-order functions; that is, functions which can themselves take other functions as arguments.
An example of such a function is the qsort function from the standard library. Quoting the qsort manual page (excerpted from the Linux Programmer's Manual):
NAME
qsort - sorts an array
SYNOPSIS
#include <stdlib.h>
void qsort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *));
DESCRIPTION
The qsort() function sorts an array with nmemb elements of size size. The base argument points to the start of the array. The contents of the array are sorted in ascending order according to a comparison function pointed to by compar, which is called with two arguments that point to the objects being compared. The comparison function must return an integer less than, equal to, or greater than zero if the first argument is considered to be respectively less than, equal to, or greater than the second. If two members compare as equal, their order in the sorted array is undefined.
Using this technique, it is possible to write a function which sorts an array of things without having to know how to compare one thing with another. Instead, the programmer is expected to write a function which when passed two pointers to things, returns an integer which is used for the comparison.
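A minimal sketch of qsort applied to an array of integers follows. The names compare_ints and sort_ints are assumptions for illustration; only qsort itself comes from the standard library.

```c
#include <stdlib.h>

/* Comparison function in the form qsort expects: it receives pointers to
   two elements and returns a negative, zero or positive integer. */
static int compare_ints(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;

    return (a > b) - (a < b);   /* avoids the overflow risk of a - b */
}

/* Hypothetical wrapper: sorts n integers into ascending order. */
void sort_ints(int *values, size_t n)
{
    qsort(values, n, sizeof values[0], compare_ints);
}
```

The name compare_ints is passed without parentheses, so qsort receives the address of the function rather than the result of calling it.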
The address of a function is taken in the same way as the address of an array: you just omit the brackets. The following example shows a function called map which does something to all of the elements of an array of double precision floating point numbers. The something in this example is provided by the function half which returns its argument divided by two.
The important points to note are that the address of the function is obtained simply by using the name of the function without the (), and how to declare the function map so that it works properly.
The final argument of map is declared like this: double (*action)(double). This is read "action is a pointer to a function which takes one double as an argument and returns a double". The parentheses around *action are important; the * binds less tightly than (), so double *action(double) means "action is a function taking one double as an argument and returning a pointer to a double", which is meaningless in this context.
The same applies in the function call. The * in (*action) changes action's meaning from "a pointer to a function" to "a function", and the call then passes this function the double at which array points. Omitting the parentheses and typing *action(*array); means "call action, expecting it to return a pointer to a double, then take the value pointed at by this pointer". Since there's no function called action, this is also meaningless in this context.
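A sketch consistent with this description is given here. The declaration of the final argument and the form of the call follow the text; the element count parameter n and the loop are assumptions for illustration.

```c
/* Returns its argument divided by two. */
double half(double x)
{
    return x / 2.0;
}

/* Applies action to each of the n elements of array, in place.
   action is a pointer to a function taking and returning a double. */
void map(double *array, int n, double (*action)(double))
{
    while (n-- > 0) {
        *array = (*action)(*array);   /* call the function action points at */
        array++;
    }
}
```

The call map(d, 3, half); passes the address of half simply by naming it without the ().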
Arrays of arrays are useful for storing two-dimensional data, such as the characters that appear on a computer terminal. While some languages permit multi-dimensional arrays explicitly (along the lines of array[i,j]), C extends the concept of single dimensional arrays of things. An array of arrays is essentially a 2-d array, so the characters on a computer terminal might be stored in an array declared something like char screen[25][80]; for 25 lines each of 80 characters. Why not screen[80][25]? Well, no reason really, but the interpretation of the former is that screen is an array of 25 things and that each thing is an array of 80 characters. Thus the former declaration is more intuitive, at least to Western Europeans, who consider the screen to be a collection of lines, each of which is a collection of characters. Cultures which write text vertically might think of it the other way around.
Because of the "array of array" idea, the number of dimensions can be increased arbitrarily. A computer's graphics display might represent each pixel (dot) on the screen with three integers, one for each of the red, green and blue colour values. For a 1280x1024 display, the appropriate declaration might be int pix[1024][1280][3];.
Multidimensional arrays can be initialised like this:
When initialising, it is important to read the dimensions "backwards": the first subscript changes the least frequently with progression through memory. So the above initialisation is consistent: try is an array of two lots of three lots of four integers.
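One possible initialisation of an array named try with these dimensions is sketched here. The values are assumptions chosen so that each element holds its own position in memory order, which makes the nesting easy to check; the accessor try_at is hypothetical.

```c
/* Two lots of three lots of four integers. The last subscript changes
   fastest with progression through memory. */
int try[2][3][4] = {
    { {  0,  1,  2,  3 }, {  4,  5,  6,  7 }, {  8,  9, 10, 11 } },
    { { 12, 13, 14, 15 }, { 16, 17, 18, 19 }, { 20, 21, 22, 23 } }
};

/* Hypothetical accessor: with the values above, the result is 12*i + 4*j + k. */
int try_at(int i, int j, int k)
{
    return try[i][j][k];
}
```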
Recalling that for pointer arithmetic to work, the size of an entity must be known to the compiler, it will be seen that it is important to indicate the extent of the multidimensional array when passing its reference to a function.
In fact, the extent of the first dimension may always be omitted. Suppose, for example, that a ten-by-ten square array was declared int sqary[10][10] and that this array was then passed to a function. Within the function, if element [i][j] were to be accessed, the address of the required variable could not be determined without knowledge of the extent of the second dimension, which is the number of elements in each row. Since this is 10 in our example, we are able to say that such an access would be equivalent to writing *(&sqary[0][0]+10*i+j).
In order to pass multi-dimensional arrays to a function, there are two options:
In the variable size case, the code can be made to work for various sizes of array, but accessing the array is more cumbersome. In the fixed-size case, accessing the array is easy and intuitive, but the dimensions of the array passed to the function must agree with those stated. For reasons already explained, it is permitted to omit the first dimension in the function declaration, but every other dimension must be stated, and the corresponding dimensions of the array passed must agree exactly for the call to compile without complaint.
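The two approaches might be sketched as follows. The constant COLS and the function names sum_fixed and sum_variable are assumptions for illustration.

```c
#define COLS 3

/* Fixed-size form: the extent of each row is part of the parameter type,
   so a[i][j] can be written directly. */
int sum_fixed(int a[][COLS], int rows)
{
    int total = 0;
    int i, j;

    for (i = 0; i < rows; i++)
        for (j = 0; j < COLS; j++)
            total += a[i][j];
    return total;
}

/* Variable-size form: a pointer to the first element plus both extents.
   The element address is worked out by hand with pointer arithmetic. */
int sum_variable(const int *a, int rows, int cols)
{
    int total = 0;
    int i, j;

    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            total += *(a + cols * i + j);
    return total;
}
```

Given int m[2][COLS];, the calls are sum_fixed(m, 2) and sum_variable(&m[0][0], 2, COLS).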
Arrays are indexed collections of entities all of the same type.
Pointers are absolutely central to C programming, and they are intimately connected with arrays. A pointer can point at any entity which matches its type declaration, which can be a simple type, another pointer, or even a function.
The name of an array is a pointer constant pointing to its first element, and the name of a function is a pointer constant pointing to the function's code.
Arrays of higher dimensions are easily declared and used in C, although in order to pass multi-dimensional arrays around a program, care must be taken that types agree. The alternative is to resort to pointer-arithmetical solutions, where care must be taken to understand the order of storage of elements in the target array.
The tools upon which this course relies are Copyright the Free Software Foundation, and are made available under the GPL (GNU General Public Licence).
The content of this course was derived from that generated by many ex-colleagues at the University of Leeds, Department of Electronics and Electrical Engineering. Much of the content has been reworked, and substantially augmented, by Dr N J Bailey, Centre for Music Technology, The University of Glasgow. This manifestation is Copyright N J Bailey; some of the content is Copyright The University of Leeds.
Diagrams on this resource are drawn in XFig and are rendered by the browser using The University of Hamburg's Simple FIG viewer applet which is Copyright (C) 1996-2002 F.N.Hendrich, hendrich@informatik.uni-hamburg.de.
The source code, programming examples and exercises are all specific to this course, and are Copyright, Dr N J Bailey.
The applet for viewing and demonstrating C programs is Copyright Dr N J Bailey, and is to be found documented and with its source code on the Centre for Music Technology website under Software.