Structured Data Types

Structures provide a mechanism for collecting together objects of different types. Unions cause entities of different types to share the same memory location, so that the same memory can be used for different purposes depending upon the programming context.

Combining structures with pointers is a particularly powerful and frequently-used programming tool. Shorthand syntax is available to increase readability and utility under such circumstances.

Structures

Arrays allow C to store an ordered collection of entities of the same type, but when there is a need to store together entities of different types, it is necessary to use a structure.

Take a look at the street in the figure above. There are seven buildings in the street, and because we are C programmers they have been numbered from 0 to 6. We want to store information about each of the buildings (house name, number of windows, whether joined to adjacent buildings etc.) collected together in an object, but we can't use an array because the information is of differing types. A structure provides just this mechanism to aggregate variables of diverse types. Before using a structure, it is necessary to provide the compiler with a structure template which states the type of each of the structure's members. The general form of a structure template is:

A Structure Template

struct optional_structure_tag {
  type_declarator member1_id;
  type_declarator member2_id;
  ...
} optional_variable_list;
The structure tag is a name for the structure which can be used in declaring variables of that type. So we might declare structures called h1, h2 and h3 of type struct House as follows:
struct House {
  int attached;
  char *houseName;
  int windows;
} h1, h2;

struct House h3;

Note that once the template has been declared, it can be used in subsequent variable declarations. For this reason, templates without the optional variable lists often appear in header files, where they can be included and used in source files. For example, the FILE type referred to in the section on standard I/O is typically a structure declared in the system header file stdio.h, and a file pointer is a pointer to such a structure.

Accessing Structure Members

Like arrays, structures can be initialised when they are declared. Suppose ATTACHED_LEFT and ATTACHED_RIGHT are defined to be 1 and 2 respectively. To make a structure describing building number 4, one might write:

struct House h4 = {ATTACHED_LEFT | ATTACHED_RIGHT,
                   "Bide-a-wee",
                   2};

Of course, the order and types of the constants in the initialiser have to agree with the structure template.

So much for setting up a structure en masse, but it is more usually necessary to access a structure one member at a time. The member operator . is used to achieve this. Instead of the above, one might have written:

struct House h4;

h4.attached = ATTACHED_LEFT | ATTACHED_RIGHT;
h4.houseName = "Bide-a-wee";
h4.windows = 2;
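
The two approaches can be placed side by side. The following is a minimal sketch rather than part of the course code: the variable h5 and the helper functions fillH5 and housesMatch are hypothetical names introduced for illustration, and the flag values are those assumed in the text.

```c
#include <string.h>

/* Adjacency flags, with the values assumed in the text */
#define ATTACHED_LEFT 1
#define ATTACHED_RIGHT 2

struct House {
  int attached;
  char *houseName;
  int windows;
};

/* Initialised en masse when it is declared */
struct House h4 = {ATTACHED_LEFT | ATTACHED_RIGHT, "Bide-a-wee", 2};

/* Declared first, then filled in one member at a time by fillH5() */
struct House h5;

void fillH5(void)
{
  h5.attached = ATTACHED_LEFT | ATTACHED_RIGHT;
  h5.houseName = "Bide-a-wee";
  h5.windows = 2;
}

/* Structures cannot be compared with ==, so compare member by member.
   Returns 1 if h4 and h5 describe the same house, 0 otherwise.
   Call fillH5() first, so that h5.houseName points at a string. */
int housesMatch(void)
{
  return h4.attached == h5.attached
      && strcmp(h4.houseName, h5.houseName) == 0
      && h4.windows == h5.windows;
}
```

Once fillH5() has been called, housesMatch() should report that the two structures hold the same values.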

Nested Structures

It is quite in order for a member of a structure to be an array, another structure, or as will be seen later, a union.

Nested Structures

/* Declare a Road structure with enough storage for 20 Houses */

struct Road {
  char *roadName;
  struct House houses[20];
} r;

The number of windows in the fifth house on this road would be r.houses[4].windows; and to test whether the house is detached, one would write if (!r.houses[4].attached) ... .
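
As a rough illustration of nested access, the sketch below walks along the road and adds up the windows. The helper functions countWindows and isDetached are hypothetical and are not part of the original example; r is declared at file scope, so all of its members start out as zero.

```c
struct House {
  int attached;
  char *houseName;
  int windows;
};

struct Road {
  char *roadName;
  struct House houses[20];
} r;

/* Hypothetical helper: total number of windows in the first n houses of r */
int countWindows(int n)
{
  int i, total = 0;

  for (i = 0; i < n; i++)
    total += r.houses[i].windows;
  return total;
}

/* Hypothetical helper: 1 if house number n is joined to neither neighbour */
int isDetached(int n)
{
  return !r.houses[n].attached;
}
```

Each step of an expression such as r.houses[i].windows selects one level of nesting: the road, then an element of its array, then a member of that element.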

There is one caveat when it comes to nested structure declarations: all structures included in a structure template must have a prior declaration, so it is essential to present structure templates in the correct order:


Legal Structure Declaration

/* The following is legal, because
   struct s2 is defined first. */
struct s2 { int i; };
struct s1 { struct s2 a; };

Illegal Structure Declaration

/* The following is illegal,
   even though it makes sense:
   struct s2 is as yet unknown. */
struct s1 { struct s2 a; };
struct s2 { int i; };

The only exception to the above is that a pointer to a structure type which is not yet declared is allowed. This is because the compiler only has to reserve enough memory for a pointer (typically four or eight bytes, depending on the machine architecture) rather than the memory for an instance of the structure itself.

Forward Pointer Declarations are Allowed

struct s1 { struct s2 *a; }; /* This is fine because a is a pointer to a struct s2 */
struct s2 { int i; };

Unions

Whereas structures aggregate several entities of possibly different types in a single place, unions represent a choice between several different entities. The storage for the entities overlaps, and the compiler will generate code which causes sufficient memory for the largest entity to be allocated.

Example Union

/* The following reserves sufficient memory for the largest
   member (probably the double). Since the members occupy
   the same storage, only one of them can be in use at a time */

union {
  double d;
  int i;
  char c;
} u; /* a variable (here u) is needed for any storage to be reserved */
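
The overlap can be checked directly. The following is a small sketch under assumptions: the tag Number and the helper functions unionSize and membersOverlap are hypothetical names chosen for illustration.

```c
#include <stddef.h>

union Number {
  double d;
  int i;
  char c;
};

/* The union is at least as large as its largest member */
size_t unionSize(void)
{
  return sizeof(union Number);
}

/* Returns 1 if all three members start at the same address,
   which is what sharing the same storage means */
int membersOverlap(void)
{
  union Number n;

  return (void *)&n.d == (void *)&n.i
      && (void *)&n.i == (void *)&n.c;
}
```

On any conforming compiler unionSize() should be no smaller than sizeof(double), and membersOverlap() should return 1.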

To illustrate the application of unions, consider a refinement of the struct House data structure already described. The figure of the street of buildings showed some garages as well as houses. It makes sense to record the number of windows in a house, but doesn't really make sense to record the number of windows in a garage. The program might simply ignore the windows member if the structure describes a garage, but it would be more elegant to restructure the data types.

It seems that there are two different sorts of buildings: houses, and garages. A slightly adjusted struct House can be used to describe the former, and a simpler struct Garage to describe the latter. The common attributes can be kept in a new entity, a struct Building wherein the house and garage structures form an anonymous union. Anonymous unions were standardised in C11; many earlier compilers accept them as an extension.

A Building Description Structure

/* First define the different building types */

struct House {
  char *houseName;
  int windows;
};

struct Garage {
  int cars; /* single garage? double? ... */
};

/* Now the main structure: the common attributes and either a
   House *or* a Garage structure */

struct Building {
  /* Common attributes for houses and garages */
  int attached;
  /* Records which member of the union is in use */
  int buildingType;
  /* Now include either a house or a garage structure */
  union {
    struct House house;
    struct Garage garage;
  };
} street[7];

/* Adjacency flags */

#define ATTACHED_LEFT 1
#define ATTACHED_RIGHT 2

/* Building type ID */

#define BUILDING_IS_HOUSE 1
#define BUILDING_IS_GARAGE 2

The above array of buildings could be used to describe the whole of the street. How many windows are there in building number 1? street[1].house.windows. Is the garage to its right attached to it? street[2].attached & ATTACHED_LEFT. How many cars can this garage accommodate? street[2].garage.cars.

The flaw in this plan is that, without further information, there's no way of telling whether an element of the street array is a house or a garage. If an attempt is made to access the wrong part of the union, the result will almost certainly be meaningless. If the context of the program requires it, an extra member called buildingType might be provided and set to a particular number according to the type of building which the union is supposed to describe.

Now one can write code which sets the buildingType member, then maybe performs an action appropriate to the building as part of a switch statement. The whole caboodle is assembled in the above panel.
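
A switch statement of that kind might look like the following sketch. The function windowCount is a hypothetical helper, and its return values for garages and for unset building types are arbitrary choices made for illustration. The anonymous union assumes a C11 compiler, or an earlier one which supports anonymous unions as an extension.

```c
/* Building type ID */
#define BUILDING_IS_HOUSE 1
#define BUILDING_IS_GARAGE 2

struct House {
  char *houseName;
  int windows;
};

struct Garage {
  int cars;
};

struct Building {
  int attached;
  int buildingType;
  union {
    struct House house;
    struct Garage garage;
  };
} street[7];

/* Hypothetical helper: number of windows in building n.
   Only the union member selected by buildingType is read. */
int windowCount(int n)
{
  switch (street[n].buildingType) {
  case BUILDING_IS_HOUSE:
    return street[n].house.windows;
  case BUILDING_IS_GARAGE:
    return 0;  /* windows are not recorded for garages */
  default:
    return -1; /* buildingType has not been set */
  }
}
```

Because street is declared at file scope, every buildingType starts out as 0, which matches neither ID; the default case therefore catches any building which has not yet been described.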

Structured Types and Pointers

Structures can be accessed indirectly via a pointer in the same way that any other variable can.

Structures and Unions via Pointers

struct example {
  int i;
  double x;
} s;

struct example *ep = &s;

...

(*ep).i = 2; /* ep points at s, so *ep is s. This sets member i of s to 2 */
ep->x = 1.5; /* shorthand method, synonymous with (*ep).x = 1.5 */

In the above example, ep is made to point at the structure s. The expression (*ep) evaluates to s and the membership operator . can be used as normal.

Because this is a very clumsy method of accessing a member of a referenced structure, and because it turns out that this method of accessing structures is actually the predominant one (for reasons discussed below), a shorthand is provided to avoid the program becoming cluttered with parentheses and indirection operators. In the above example, -> is used to access the member x of a structure given a pointer to the structure rather than the structure itself.

-> is often pronounced "points at", although this is a slightly unfortunate convention, because that isn't exactly what it means.

In the section on memory, it will be seen that structures very often have pointers to other structures as members; it is occasionally very convenient to be able to write expressions like s1 -> s2 -> x. Here s1 would be a pointer to a structure with a member s2, also a pointer to a structure. The structure pointed at by s2 has a member called x which is being accessed here. Without the -> shorthand, this expression would be reduced to an almost unreadable mess of punctuation: (*(*s1).s2).x

Note that, because * binds less tightly than ., *ep.x does not mean the same as (*ep).x. The former would only make sense if ep were a structure rather than a pointer, and its member x were a pointer: then it would mean *(ep.x), which is to say, the entity pointed at by member x of ep.
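
The equivalence of the shorthand and longhand forms, and the effect of operator precedence, can be sketched as follows. The structure tags Outer, Inner and Holder and the three helper functions are hypothetical names introduced for this illustration only.

```c
/* Outer holds a pointer to an Inner, which may be declared afterwards */
struct Outer { struct Inner *s2; };
struct Inner { int x; };

/* A structure whose member x is itself a pointer */
struct Holder { int *x; };

/* Two levels of indirection using the -> shorthand */
int readWithArrow(struct Outer *s1)
{
  return s1->s2->x;
}

/* Exactly the same access written with * and . */
int readLonghand(struct Outer *s1)
{
  return (*(*s1).s2).x;
}

/* Here h is a structure, not a pointer, so *h.x is parsed as *(h.x):
   the int pointed at by member x of h */
int readThroughMember(void)
{
  int value = 7;
  struct Holder h;

  h.x = &value;
  return *h.x;
}
```

Given an Outer whose s2 member points at an Inner, readWithArrow and readLonghand return the same value of x.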

Passing Structures to and from Functions

A function can be declared to return a structure, or can have a structure as one of its parameters. In this case, the behaviour is exactly as expected: if a structure appears in the function's parameter list, the values of all its members are copied into another structure which is local to the function. If a structure is returned, the values of all its members are copied into the target structure in the calling context when the function completes.

Of course, because structures can be very large, using them as arguments to or return values from functions should really only be a last resort.
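
The copying behaviour can be seen in a small sketch. The functions olderByValue and birthdayByPointer are hypothetical helpers, not part of the course code.

```c
struct Person {
  char *name;
  unsigned int age;
};

/* p is a copy local to the function: changing it leaves the
   caller's structure untouched. Returns the age held in the copy. */
unsigned int olderByValue(struct Person p)
{
  p.age++;
  return p.age;
}

/* p points at the caller's structure, so the change is made there */
void birthdayByPointer(struct Person *p)
{
  p->age++;
}
```

After a call to olderByValue the caller's age member is unchanged, whereas a call to birthdayByPointer with the address of the same structure increases it by one.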

Avoiding Passing and Returning Structures


Returning Structures

struct Person {
  char *name;
  unsigned int age;
};

struct Person jsb(void)
{
  struct Person p = {"J S Bach", 318};
  return p;
}

Avoiding Structures

struct Person {
  char *name;
  unsigned int age;
};

void jsb(struct Person *p)
{
  p->name = "J S Bach";
  p->age = 318;
}

The code labelled "Avoiding Structures" passes only a pointer to a structure to be filled with information, so the calling and return overheads are very small. The code labelled "Returning Structures" copies a whole structure on its return, so there is an overhead every time it is called, and that overhead grows with the size of the structure.


When passing pointers around, be careful to remember what is assigned and how. In the above example, the name field is set to point at a string literal, not a modifiable string. The pointer-based version also requires that the calling function has allocated sufficient memory for the storage of the information; otherwise it will attempt to write into unallocated memory, resulting in the corruption of other variables or in a segmentation fault. It would be equally possible to make the jsb function reserve memory and return a pointer to the memory thus allocated, as will be seen in the section on memory.
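
A call to the pointer-based jsb might be sketched as below. The function callCorrectly and the variable composer are hypothetical names; the faulty call is left inside a comment so that the example remains safe to compile and run.

```c
struct Person {
  char *name;
  unsigned int age;
};

void jsb(struct Person *p)
{
  p->name = "J S Bach";
  p->age = 318;
}

/* Correct: the caller reserves storage for a structure and
   passes its address for jsb to fill in */
unsigned int callCorrectly(void)
{
  struct Person composer; /* storage is reserved here */

  jsb(&composer);
  return composer.age;
}

/* Incorrect: an uninitialised pointer refers to no storage at all,
   so jsb would write into memory which has not been allocated.

   struct Person *nowhere;
   jsb(nowhere);
*/
```

Note that composer.name still points at a string literal after the call, so the characters of the name must not be modified through it.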

The choice of which of these methods is most appropriate is really a matter of style. However, confusing them, particularly the ones which return pointers, can give rise to subtle bugs including memory leaks which are particularly hard to track down.


Summary:

Structures are essential tools for constructing readable programs of even modest complexity. They consist of an aggregation of entities of (possibly) differing types, "members", which can be directly accessed using the . operator. Structures are particularly powerful when used with pointers, and the -> operator is provided to help out in this case.

Copyright and Acknowledgements

The tools upon which this course relies are Copyright the Free Software Foundation, where they are made available under the GPL (GNU General Public Licence).

The content of this course was derived from that generated by many ex-colleagues at the University of Leeds, Department of Electronics and Electrical Engineering. Much of the content has been reworked, and substantially augmented, by Dr N J Bailey, Centre for Music Technology, The University of Glasgow. This manifestation is Copyright N J Bailey; some of the content is Copyright The University of Leeds.

Diagrams on this resource are drawn in XFig and are rendered by the browser using The University of Hamburg's Simple FIG viewer applet which is Copyright (C) 1996-2002 F.N.Hendrich, hendrich@informatik.uni-hamburg.de.

The source code, programming examples and exercises are all specific to this course, and are Copyright, Dr N J Bailey.

The applet for viewing and demonstrating C programs is Copyright Dr N J Bailey, and is to be found documented and with its source code on the Centre for Music Technology website under Software